How we keep EXT:solr up to date with the TYPO3 core

This is a summary of my talk for the TYPO3 developer days in Malmö «How we keep EXT:solr up to date with the TYPO3 core».

Our challenge

TYPO3 8 LTS was scheduled for release on 4th April 2017. Our plan, and our challenge, was to release our Apache Solr for TYPO3 extensions two days later. There are several side effects that we need to keep in mind:

  • The TYPO3 core is moving forward, especially shortly after the release
  • EXT:solr is moving as well, especially before the release
  • We want to stay compatible with 7.6 while being ready for 8.7
  • We have several add-ons that depend on EXT:solr (solrfal and solrfluid) that need to work in this combination too
  • Several PHP versions need to be supported
  • MySQL and Apache Solr also need to be compatible with the combinations

Conclusion: In the end, everything needs to work together to have a running search in TYPO3.

How we tackled it

To cover the previously mentioned side effects we did the following:

  • We test all combinations (PHP, Solr, TYPO3) on every change
  • We test them before merging into EXT:solr (on pull request level)
  • We test continuously with the TYPO3 Core
  • We attend the TYPO3 core stabilization sprint in Aarhus

To implement the technical mechanisms above, we use GitHub pull requests for every change (no direct push to any branch). Every pull request is checked by travis-ci.org and scrutinizer-ci.com, which helps us find problems with a change before it gets merged.

EXT:solr

Important: The goal is not to blame anyone's changes. The goal is to keep EXT:solr running with all combinations!

So if you want to start contributing to EXT:solr you can create a fork and enable travis-ci.org and scrutinizer-ci.com on your fork, and run these things even before creating a pull request.

Introduction to travis-ci.org

Travis-ci.org is a continuous integration platform that can run builds configured with a simple YAML file. A build contains checks on the software that make sure that your software works as expected. In the case of EXT:solr we currently run:


  • PHPLinter: Checks if there is any PHP file with invalid syntax
  • PHP-cs-fixer: Checks if there is any file that is not PSR-2 formatted
  • Unit tests:
    • PHP tests that check if smaller units of code (classes or methods) work as expected
    • Have no dependencies on a certain existing environment like a database or Solr server
    • Run fast and give quick feedback
    • Require loosely coupled code (requires refactoring of legacy code)
  • Integration/Functional tests:
    • PHP tests that cover bigger parts of the application
    • Are easier to write and often do not require restructuring the code
    • But have dependencies on e.g. a database or Solr server

The goal should certainly be to cover as much as possible with unit tests, because they run much faster. We will come back to that topic later when we look at the refactoring. But first, let's take a closer look at travis-ci.org.

A closer look at .travis.yml

language: php

This defines the language of your project, in our case PHP.

addons:
   apt:
      packages:
         - parallel

In the addons section, you can define additional things that should be available in your testing container. In our case, we install the parallel package, which we use to run the PHP linter in parallel.

env:
   global:
      - TYPO3_DATABASE_NAME="typo3_ci"
      - TYPO3_DATABASE_HOST="127.0.0.1"
      - TYPO3_DATABASE_USERNAME="root"
      - TYPO3_DATABASE_PASSWORD=""

In the env section, you can define environment variables that you need during the tests. Here we define the environment variables that we use for the TYPO3 testing framework. Our bootstrap.sh script reads them and sets them for the testing framework.
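As a rough sketch of that hand-over, a bootstrap script could copy the Travis variables into the variables the testing framework reads. The target names (typo3DatabaseName and so on) are the ones the TYPO3 core functional testing framework is commonly configured with, but treat them, and the fallback defaults, as assumptions here; the real bootstrap.sh may do this differently.

```shell
#!/usr/bin/env bash

# Fall back to the values from .travis.yml when the script is run
# outside of Travis (assumption: same defaults are wanted locally).
: "${TYPO3_DATABASE_NAME:=typo3_ci}"
: "${TYPO3_DATABASE_HOST:=127.0.0.1}"
: "${TYPO3_DATABASE_USERNAME:=root}"
: "${TYPO3_DATABASE_PASSWORD:=}"

# Assumed target names: the variables read by the TYPO3 functional
# testing framework to connect to the test database.
export typo3DatabaseName="$TYPO3_DATABASE_NAME"
export typo3DatabaseHost="$TYPO3_DATABASE_HOST"
export typo3DatabaseUsername="$TYPO3_DATABASE_USERNAME"
export typo3DatabasePassword="$TYPO3_DATABASE_PASSWORD"
```

Keeping the mapping in one place means the database settings only have to be changed in .travis.yml, not in every test command.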

matrix:
   fast_finish: true
   include:
      - php: 5.5
        env: TYPO3_VERSION=^7.6
      - php: 5.6
        env: TYPO3_VERSION=^7.6
      - php: 7.0
        env: TYPO3_VERSION=^7.6
      - php: 7.1
        env: TYPO3_VERSION=^7.6
      - php: 7.1
        env: TYPO3_VERSION=^8.7
      - php: 7.0
        env: TYPO3_VERSION=^8.7

In the matrix section, we define the combinations of PHP and TYPO3 that we want to use when we execute our builds. Each combination creates its own test container on Travis CI. As we also develop for dev-master, it can be added here as well.

before_install:  
   - composer self-update

install:  
   - Build/Test/bootstrap.sh

script:  
   - Build/Test/cibuild.sh

after_script:  
   - Build/Test/publish_coverage.sh

There are several build steps that we use to do different things:

  • before_install: We make sure that Composer is up to date
  • install: We bootstrap the environment (install Solr, PHP-cs-fixer and other tools that we need during the build)
  • script: Here we run the actual test suite
  • after_script: Here we send the collected code coverage from travis-ci.org to scrutinizer-ci.com using ocular

cache:
   directories:
      - $HOME/.composer/cache

Last but not least, we store the Composer cache in a Travis cache to avoid downloading the packages again in every build.

Introduction to Scrutinizer-ci.com

Scrutinizer-ci.com is a continuous integration tool that helps us to measure and visualize code quality:

  • Every project gets a simple rating between 0 and 10 that indicates the quality of the code
  • It proposes refactorings for complex code parts
  • It can find bugs and propose fixes
  • It creates reports of changes (issues, code coverage, quality rating) that help to check the trend of these metrics
  • It is free for open source projects
  • It is configured in one .scrutinizer.yml file

A closer look at .scrutinizer.yml

The configuration also happens in a simple YAML file. In our project we use the following parts:



checks:
   php:
      code_rating: true
      duplication: true

We want to get a code rating and check for duplications.

filter:
   paths:
      - 'Classes/*'

We only evaluate the classes below Classes, since the other PHP files only return TYPO3-related configuration whose structure is fixed.

tools:
   php_cpd:
      enabled: true

   php_code_sniffer:
      enabled: true
      config:
         standard: TYPO3CMS

   # we do this on travis
   php_cs_fixer:
      enabled: false

   php_mess_detector:
      enabled: true
      config:
         controversial_rules:
            superglobals: false

   php_pdepend:
      enabled: true

   # coverage pushed from travis-ci.org
   external_code_coverage:
      # two runs unit and integration
      runs: 2
      timeout: 1800

In the tools section, the underlying PHP QA tools can be configured. In our case this means:

  • We use the PHP Copy/Paste Detector (php_cpd)
  • We use the PHP code sniffer with the TYPO3 CMS standard
  • We disable PHP-cs-fixer since we run it on Travis
  • We use the PHP mess detector (php_mess_detector) but disable the superglobals check, since superglobals are sometimes required (e.g. for $GLOBALS['TSFE'] …)
  • We use PHPDepend (php_pdepend)
  • We wait for two code coverage files that are sent to Scrutinizer (one for the unit tests, one for the integration tests)

build_failure_conditions:
   - 'patches.label("Doc Comments").count > 0'
   - 'patches.label("Spacing").count > 0'
   - 'issues.label("coding-style").count > 0'
   - 'issues.severity(>= MAJOR).count > 0'

Build failure conditions allow us to define a threshold for certain metrics that causes the build to fail when it is exceeded. This is very handy when you e.g. want to make sure that the quality of an imported project will not get worse over time.

Conclusion

There are a lot of services available around GitHub that allow you to run automated tests and code checks for free. This helps us a lot for EXT:solr to make sure the extension still works with the latest TYPO3 core versions.

We started by adding these tests and improved the CI process over time. This enables us to integrate bigger features faster and reduces the effort for manual testing.

The road to testing paradise

The first lines of EXT:solr are 8 years old now :) Since the way PHP applications and TYPO3 extensions are built has improved a lot, we need to continuously transform the code to keep it up to date. This means we need to change the code continuously while keeping it doing what it did before. Automated tests help us to change the code while making sure it works as before.

The ideal world

In the ideal developer world, the automated tests look like a pyramid:

Ideal developer world

  • At the bottom you have a large number of fast unit tests that give you quick feedback when a smaller part of the code no longer works as before.
  • In the middle you have a medium number of integration/functional tests that test bigger parts of the application but also take longer to run.
  • At the top you have a few acceptance tests that test the application from outside (e.g. the website with Selenium) and do end-to-end tests.

The tests are executed from bottom to top. That way they fail as quickly as possible and give you feedback during development.

To develop this way you need to fulfill some requirements:
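The bottom-to-top order can be sketched as a small runner that executes the stages in sequence and stops at the first failure. The three stage functions below are hypothetical stand-ins (with a simulated failure in the integration stage), not the commands EXT:solr actually runs.

```shell
#!/usr/bin/env bash

# Hypothetical stand-ins for the real test commands, ordered from the
# bottom of the pyramid (fast) to the top (slow).
run_unit_tests()        { echo "unit tests";        return 0; }
run_integration_tests() { echo "integration tests"; return 1; }  # simulated failure
run_acceptance_tests()  { echo "acceptance tests";  return 0; }

# Run the given stages in order and stop at the first one that fails,
# so the slower stages are never started after a cheap failure.
run_pyramid() {
    local stage
    for stage in "$@"; do
        if ! "$stage"; then
            echo "stopped after: $stage" >&2
            return 1
        fi
    done
}

status=0
executed=$(run_pyramid run_unit_tests run_integration_tests run_acceptance_tests 2>/dev/null) || status=$?
echo "$executed"
```

Because the integration stage fails in this sketch, the acceptance stage is never started, which is exactly the quick feedback the pyramid is meant to give.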

  • You need to have loosely coupled code to add unit tests. This means your code needs to get its dependencies from outside instead of creating them on its own. When code is written this way, e.g. by using a "dependency injection container", dependencies can be faked so that only the relevant code is tested.
  • You need to have a disciplined development team that wants to work this way.

On the other hand, this also means:

  • It is hard to add unit tests for older (legacy) code since it often creates its dependencies on its own. Refactoring is required so that those dependencies can be set from outside.

What we do in EXT:solr to improve

To improve these automated tests, we currently add integration tests first, which cover bigger parts of the application.

Developers world

Afterwards we restructure the code and also add unit tests for the new components:

Developers world

This has the following advantages:

  • Refactorings are already covered by integration tests.
  • After adding unit tests we also get quick feedback on these parts of the code.

Where we are and where we can get

Currently, we have around 400 unit tests that take 10 seconds to run. The coverage here is still low but should improve over time. The integration tests take around 15 minutes but cover bigger parts of the application.

Unit tests

Together, around 65% of the code is tested automatically. This is a step in the right direction, but there is still room for improvement. Nevertheless, we need to keep in mind that doing all this manually would take days.

What we achieved

With the tools that we described above, we managed to release EXT:solr 6.1.0 as planned on 6th April, two days after TYPO3 8 LTS. At the same time, we were able to merge around 160 pull requests from 14 contributors, and the trends are going in the right direction :)

Statistic and trends

What can be next

By the end of the month we plan to release EXT:solr 7.0.0 with the following highlights:

  • FLUID templating is now the default
  • Backend modules are split into submodules, which allows editors to be given access rights to change e.g. stop words or synonyms
  • Multiple plugin instances on the same page and custom plugin namespace are now possible
  • We refactored the Query API to make it easier to use and less complex
  • Several bugfixes

For the second half of the year, there are the following things that could potentially be done (as always, depending on the budget):

Q3 - Focus: UX improvements:

  • Grouping add-on for FLUID (EB only)
  • Improve out-of-the-box facets (e.g. Color facet, tag cloud …)
  • Migrate default templates to bootstrap CSS
  • Extend index inspector functionality

Q4 - Focus: Latest Apache Solr Version:

  • Add support for Apache Solr 6.x
  • Extract indexing API in EXT:solr

For 2018 there is no real roadmap yet, but we have several ideas that could potentially be done:

  • Indexing Assistant: Simplify the indexing of records
  • Semantic/NReach Support
  • Housekeeping: Get ready for TYPO3 9. Since there is a lot to do just for the Doctrine migration, it will take some effort to get EXT:solr ready for TYPO3 9

Semantic services connected with Apache Solr for TYPO3

As briefly mentioned before, dkd is working on a semantic service platform, nreach.io, that provides semantic services for TYPO3. To show you how easily these services can be integrated into TYPO3 and what powerful features are possible, we've implemented a PoC of a picture search based on nreach, Solr and solrfal. It allows you to search for objects in pictures and also to facet on attributes extracted by nreach.

The following video shows how a FAL object can be enriched by nreach with semantic data:

And then we use the frontend to search and filter for properties that have been extracted by nreach automatically:

Resources and Links

Here is a collection of links of tools and resources mentioned in the post above:

Summary

As you can see, we did several things to provide a release for TYPO3 8 LTS, and we also have quite a few ideas for the future. If you want to support us, go to https://typo3-solr.com or call +49 69 2475218-0 and become a partner today!

by Timo Hund

Timo Hund works as a senior developer at dkd Internet Service GmbH. Over the last year he has gathered a lot of experience with Apache Solr and Lucene in several IT projects.