CI assignment

How to run

You need to have composer installed to run this code. The code is written in PHP 7 and has been tested on Postgres 9.5

# this step will ask you for the database credentials
composer install

# creates database tables and views
./bin/console doctrine:migrations:migrate 

# to import the provided sample graph data:
./bin/console ci_test:importer:import

# to import custom graph data from url
./bin/console ci_test:importer:import -d http://yourdomain.com/graph.xml

# to run a json query (data file is included)
./bin/console ci_test:importer:query < data/query.sample.json

Answers to questions

The XML library I chose is JMS Serializer, because using this library I'm able to work with the same objects while deserializing the input and persisting the data in the database using Doctrine. The library would cause problems with big input files, but nobody expected big input files in the test and I saved quite a lot of time, because I didn't need to write a custom parser.

Also it is worth to mention, that I'm not checking the correctness of the input file, I'm validating domain entity data parsed from the file. In this way I was able to achieve separation of concerns - 1) parsing, 2) validation, 3) persisting into database. The code is much cleaner and extensible this way and I was able to reuse some general purpose libraries and didn't have to write the low level code myself.

The SQL schema related to the XML input file of the graph is a 1:1 mapping. It is the best solution to have a normalized graph structure in a relational database (at least in my opinion). The identifiers of the nodes and edges are of the type string, so it is easy to map them to the XML input and it is easy to update them with XML file updates. (not working 100% right now, because nobody required this feature). The nodes and edges are keeping their relation to the corresponding graph, so it is possible to import multiple graph structures (under the condition that the node and edge identifiers stay unique between graphs).
SQL query for finding the cycles in the database:

WITH RECURSIVE paths (node, depth, path, cycle) AS (
  SELECT (SELECT id FROM first_node), 0, ARRAY[(SELECT id FROM first_node)], false
UNION ALL
  SELECT to_id, depth + 1, (path || to_id::text), to_id = ANY(path)
  FROM edge, paths
  WHERE from_id = node AND NOT cycle
), first_node AS (
  SELECT id::text FROM node ORDER BY id OFFSET 0 LIMIT 1
) SELECT path[array_position(path, path[array_upper(path, 1)]):array_upper(path, 1)] FROM paths WHERE cycle

Warning - this query only works if there is at least one edge coming out of each node and if there are no separate clusters in the graph.

I created a view named 'path' which finds all the paths of the graph and can be restricted to find only paths between two endpoints using a WHERE clause. The query is not optimal, because it always calculates all the existing paths between all the nodes. Since this is just a test, I didn't want to optimize the query, because the code will be thrown away anyway. If I wanted to optimize the query properly, so it would only search for the requested paths, I would probably use this method: http://explainextended.com/2010/12/24/postgresql-parametrizing-a-recursive-cte/

I also modified the format of the input json query and the output json format, because the format from the assignment is invalid (there are arrays with string keys in the example). The changes are visible in the sample json file in the data folder.

For JSON parsing/serialization I chose the JMS Serializer library, for the same reasons as mentioned above.

"Are there any paths problem": I used the same query, that is supposed to find cycles in the graph, I just changed the final condition, so the results being returned contain no cycles. The query basically finds all the paths from all the nodes to all the nodes (explained above). I created a view from the query, to be able to query the data using a Doctrine entity (a little hack). The same query also computes the cost of each path and returns all the paths ordered by the computed cost. In this way, I was able to solve the "cheapest path problem", because I'm re-using the same query and appending a LIMIT 1 clause at the end of it.

The details are in the query, which is written in the app/DoctrineMigrations/Version20160214145352.php file. I'm not really sure what more there is to explain, because it is a modified query from the Postgres documentation example for the WITH clause. I only added the path cost computation and unique identifier generation (which made it more convenient for Doctrine).

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
app		app
bin		bin
data		data
src		src
tests/AppBundle/Controller		tests/AppBundle/Controller
var		var
web		web
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
composer.json		composer.json
composer.lock		composer.lock
phpunit.xml.dist		phpunit.xml.dist

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

app

app

bin

bin

data

data

src

src

tests/AppBundle/Controller

tests/AppBundle/Controller

var

var

web

web

.gitignore

.gitignore

LICENSE

LICENSE

README.md

README.md

composer.json

composer.json

composer.lock

composer.lock

phpunit.xml.dist

phpunit.xml.dist

Repository files navigation

CI assignment

How to run

Answers to questions

About

Releases

Packages

Languages

License

Rastusik/ci-assignment

Folders and files

Latest commit

History

Repository files navigation

CI assignment

How to run

Answers to questions

About

Resources

License

Stars

Watchers

Forks

Languages