Conditional step with Dataflow


Image by Free-Photos from Pixabay


Today, I am writing about something you meet in almost any code: the if statement. Not in the general case, though, but in the context of our Dataflow bundle.

If you don't know the Dataflow bundle, please read this tutorial.

Every Dataflow workflow is linear. You can't fork the flow to apply different processing, as you could with the Symfony Workflow component. That is by design: Dataflow is written to process a flow of data.
However, a step can be skipped when a condition is met.
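
To picture the idea outside the bundle, here is a minimal plain-PHP sketch. The helper when() and the function runSteps() are hypothetical names invented for this illustration, not part of Dataflow: when() wraps a step so that it runs only if a condition holds, and otherwise passes the item through unchanged.

```php
<?php
// Hypothetical helper (not provided by the Dataflow bundle): wraps a step so
// that it only runs when $condition($item) is true; otherwise the item is
// returned unchanged, which amounts to skipping the step.
function when(callable $condition, callable $step): callable
{
    return function ($item) use ($condition, $step) {
        return $condition($item) ? $step($item) : $item;
    };
}

// A linear chain: every item goes through every step, in order.
$steps = [
    when(
        function (array $item): bool { return $item['code'] === '00'; },
        function (array $item): array { $item['label'] = 'author'; return $item; }
    ),
    when(
        function (array $item): bool { return $item['code'] === '02'; },
        function (array $item): array { $item['label'] = 'article'; return $item; }
    ),
];

// Hypothetical runner standing in for the flow: applies the steps one by one.
function runSteps(array $steps, array $item): array
{
    foreach ($steps as $step) {
        $item = $step($item);
    }

    return $item;
}

$author = runSteps($steps, ['code' => '00']); // only the first step acts
$other = runSteps($steps, ['code' => '99']);  // both steps are skipped
```

The flow stays linear: the item with code 99 still travels through both steps, they simply leave it untouched.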

My old code

This is the legacy code that I want to rewrite using Dataflow.

 

   

<?php
//[...]
$datas = [...]; 
foreach ($datas as $data) { 
    $code = trim($data['code']);
    switch ($code) {
        case '00':
            $savedData = [
                'name' => trim($data['name']) . ' ' . trim($data['lastname']),
                'email' => strpos($data['email'], '@') ? $data['email'] : null,
            ];
            $keys = [
                'external_id' => $code . $data['id'],
            ];
            $db->insertOrUpdate('author', $savedData, $keys);
            break;
        case '01':
            $savedData = [
                'name' => trim($data['name']) . ' ' . trim($data['lastname']),
                'email' => strpos($data['email'], '@') ? $data['email'] : null,
                'comment' => strip_tags($data['comment']),
                'is_spam' => boolval($data['spam']),
                'approved' => boolval($data['public']),
                'related_article' => '02'.$data['article_id'],
            ];
            $keys = [
                'external_id' => $code . trim($data['id']),
            ];
            $db->insertOrUpdate('article_comment', $savedData, $keys);
            break; 
        case '02':
            $savedData = [
                'title' => trim($data['title']),
                'content' => strip_tags($data['content']),
                'note' => intval($data['stars']),
                'author' => '00'.$data['author_id'],
            ];
            $keys = [
                'external_id' => $code . trim($data['provider']),
            ];
            $db->insertOrUpdate('article', $savedData, $keys);
            break;
    }
}
//[...] 
   


In the legacy code, each switch case processes the data differently, yet the call that inserts or updates the row in the database is duplicated in every case. With Dataflow, that database call becomes the job of the writer.

With Dataflow, the foreach statement can be removed because it is the responsibility of the reader to provide the data line by line. For convenience, the reader here returns an array, but you can use an object instead.
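
As a rough illustration of "line by line", a reader can be pictured as a PHP generator that yields one array per source line. The function name readRows(), the semicolon-separated format and the column order below are assumptions made up for this sketch; the real source of the legacy data is not shown in this article.

```php
<?php
// Hypothetical reader: turns semicolon-separated lines into one array per row.
// A generator yields rows one at a time instead of building the whole list.
function readRows(iterable $lines): \Generator
{
    foreach ($lines as $line) {
        // Assumed column order for this sketch: code;id;name;lastname;email
        [$code, $id, $name, $lastname, $email] = explode(';', $line);

        yield [
            'code' => $code,
            'id' => $id,
            'name' => $name,
            'lastname' => $lastname,
            'email' => $email,
        ];
    }
}

$lines = [
    '00;12; Ada ;Lovelace;ada@example.com',
    '00;13;Alan;Turing ;not-an-email',
];

// Collected into an array here only so the result can be inspected.
$rows = iterator_to_array(readRows($lines), false);
```

The reader does no cleaning: trimming and validation stay in the steps, as in the legacy switch.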

The first idea is to put the whole switch statement into a single step. But that's a bad idea: this example is relatively simple, whereas real-life code is not.

The future legacy code

The better choice is to give each case its own step, preceded by a first step that prepares the array.

   

//[...]
class ArticlesDataflowType extends AbstractDataflowType
{
    protected function buildDataflow(DataflowBuilder $builder, array $options): void
    {
        //[...]
        $builder->addStep(function ($row) {
            $row = ['original_datas' => $row, 'new_datas' => []];
            $row['code'] = trim($row['original_datas']['code']);
            return $row;
        })
            ->addStep(function ($data) {
                if ($data['code'] !== '00') {
                    return $data;
                }
                $data['new_datas'] = [
                    'name' => trim($data['original_datas']['name']) . ' ' . trim($data['original_datas']['lastname']),
                    'email' => strpos($data['original_datas']['email'], '@') ? $data['original_datas']['email'] : null,
                    'external_id' => $data['code'] . $data['original_datas']['id'],
                ];
                return $data;
            })
            ->addStep(function ($data) {
                if ($data['code'] !== '01') {
                    return $data;
                }
                $data['new_datas'] = [
                    'name' => trim($data['original_datas']['name']) . ' ' . trim($data['original_datas']['lastname']),
                    'email' => strpos($data['original_datas']['email'], '@') ? $data['original_datas']['email'] : null,
                    'comment' => strip_tags($data['original_datas']['comment']),
                    'is_spam' => boolval($data['original_datas']['spam']),
                    'approved' => boolval($data['original_datas']['public']),
                    'related_article' => '02'.$data['original_datas']['article_id'],
                    'external_id' => $data['code'] . trim($data['original_datas']['id']), // trim() kept from the legacy case '01'
                ];
                return $data;
            })
            ->addStep(function ($data) {
                if ($data['code'] !== '02') {
                    return $data;
                }
                $data['new_datas'] = [
                    'title' => trim($data['original_datas']['title']),
                    'content' => strip_tags($data['original_datas']['content']),
                    'note' => intval($data['original_datas']['stars']),
                    'author' => '00'.$data['original_datas']['author_id'],
                    'external_id' => $data['code'] . trim($data['original_datas']['provider']), // trim() kept from the legacy case '02'
                ];
                return $data;
            })
        ;
        //[...]
    }
} 
   

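In the first step, I wrap each row in a new array: it keeps the source data under original_datas, adds an empty new_datas array that will hold the values to save in the database, and stores the trimmed code. The step then returns this array.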
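
To make that structure concrete, the sketch below runs the same preparation closure on a sample row. The sample values are invented for the illustration.

```php
<?php
// Same logic as the first step of the dataflow, extracted into a variable
// so it can be called directly on a sample row.
$prepare = function (array $row): array {
    $row = ['original_datas' => $row, 'new_datas' => []];
    $row['code'] = trim($row['original_datas']['code']);

    return $row;
};

// Made-up sample row with stray whitespace around the code.
$envelope = $prepare(['code' => ' 00 ', 'id' => '12', 'name' => 'Ada']);

// $envelope now has three keys:
//   original_datas: the untouched source row
//   new_datas:      an empty array, to be filled by a later step
//   code:           the trimmed code, '00'
```

The following steps only ever read from original_datas and write to new_datas, so the source row stays available for inspection.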

Each of the following steps contains a single if statement that checks whether the step should process the data. Following the early-return principle, when the condition is not met, the data is returned immediately, unchanged.
In this example, the default case is not implemented. You can filter out all unprocessed data by adding a last step like this:

   

->addStep(function ($data) {
    return empty($data['new_datas']) ? false : $data;
}); 
   

Any row provided by the reader that no step has processed is dropped by this step: returning false removes the row from the flow, so it never reaches the writer.
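
The effect of this last step can be reproduced with a small plain-PHP stand-in for the flow. The process() function below is a hypothetical simplification written for this article, not the bundle's actual implementation; it only assumes, as described above, that a step returning false removes the item.

```php
<?php
// Hypothetical stand-in for the flow: applies every step to every item and
// discards an item as soon as a step returns false.
function process(iterable $reader, array $steps): array
{
    $written = [];
    foreach ($reader as $item) {
        foreach ($steps as $step) {
            $item = $step($item);
            if ($item === false) {
                continue 2; // item dropped: move on to the next row
            }
        }
        $written[] = $item; // here a real writer would receive the item
    }

    return $written;
}

$steps = [
    // Preparation step: wrap the row and extract the trimmed code.
    function ($row) {
        return ['original_datas' => $row, 'new_datas' => [], 'code' => trim($row['code'])];
    },
    // Conditional step: only rows with code '02' are processed.
    function ($data) {
        if ($data['code'] !== '02') {
            return $data;
        }
        $data['new_datas'] = ['title' => trim($data['original_datas']['title'])];

        return $data;
    },
    // Final filter: drop every row that no step has processed.
    function ($data) {
        return empty($data['new_datas']) ? false : $data;
    },
];

$written = process([
    ['code' => '02', 'title' => ' Hello '],
    ['code' => '99', 'title' => 'Unknown code'],
], $steps);
```

Only the row with code 02 ends up in $written; the row with the unknown code 99 is filtered out before it would reach a writer.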

In conclusion, Dataflow provides a framework to structure and organize your data processing. In future articles, you will discover the full power of this bundle.
