1. Context

If you encounter any error, feel free to enter an issue on GitHub.

1.1. Context diagram

SystemContext

1.2. What is this software system about?

Getting "true" public transport delays is often impossible, because the organisation responsible for displaying delays is also the organisation that operates the transports. As a consequence, displaying the effective delay is not always in its best interest. This system exists to overcome that.

1.3. What is it that’s being built?

We will build a system that allows us to easily compare official timetables with effective transport delays by:

  1. Asking users on board to tell us whether the transport is late or not

  2. Comparing the effective transport location with the one provided by real-time location services, when they exist

1.4. How does it fit into the existing environment? (e.g. systems, business processes, etc)

As we’re doing this as a startup, we have no internal context. There is, however, an external context.

We will use the services of Navitia, which provides timetables for the public transport systems of all French cities, as well as for intercity trains.

We will also use geolocation services provided by SNCF (for intercity trains) and other providers where possible.

1.5. Who is using it? (users, roles, actors, personas, etc)

We currently envision two types of users (who can in fact be the same person at different times):

  1. The waiting user, to whom we will send accurate schedule information.

  2. The user already in the transport, who can inform waiting users whether the transport was on time or not.

2. Functional Overview


This system allows a person waiting for a public transport to get information on the transport schedule, as provided by people upstream in the same public transport. Imagine it as a crowdsourced SMS from a friend.

Features are quite simple.

2.1. For a waiting person

When someone waits for a transport, kafkatrain detects (from the user’s location and the timetables) the possible transports the person may want to board. If multiple transports match, the application lets the user select the one they are waiting for. Once a transport has been selected, information from upstream users is presented. This information will typically take the form of

At stop "name of stop", transport was "on time|late by n minutes"

As of today, we have not designed the application’s UI.
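The detection step above could be sketched as a plain proximity filter. This is a minimal illustration, assuming a hypothetical Stop type and a 500 m search radius; neither is part of the actual design.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: find the stops (hence candidate transports)
// close to the waiting user. The Stop record and the radius are
// assumptions for illustration only.
public class TransportMatcher {

    public record Stop(String name, double lat, double lon) {}

    // Great-circle distance in meters (haversine formula).
    public static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double r = 6_371_000; // mean Earth radius in meters
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * r * Math.asin(Math.sqrt(a));
    }

    // Keep only the stops close enough to the user's position.
    public static List<Stop> candidates(double userLat, double userLon,
                                        List<Stop> stops, double radiusMeters) {
        return stops.stream()
                .filter(s -> distanceMeters(userLat, userLon, s.lat(), s.lon()) <= radiusMeters)
                .collect(Collectors.toList());
    }
}
```

In the real system this kind of query would be answered by the search engine holding the timetables rather than by an in-memory list.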

2.2. For someone in a transport

Once the user is in the public transport and the transport is moving, the application simply sends a message notifying the system that the transport is on its way.

3. Quality Attributes


3.1. Common constraints

Performance (e.g. latency and throughput)

The application should be available to any user within 5 seconds.

Scalability (e.g. data and traffic volumes)

We expect a first deployment with 1,000 users (and 100 simultaneous user connections).

Availability (e.g. uptime, downtime, scheduled maintenance, 24x7, 99.9%, etc)

The application will initially target 99.9% availability.

Security (e.g. authentication, authorisation, data confidentiality, etc)

No user data should be stored by the system. Authentication and authorisation will be managed using OpenID Connect with the usual identity providers (Google, Facebook, …​)

Extensibility

The application will be extended to various geographic areas and types of public transport (buses, trains, boats), but there will be no feature extensibility.

Flexibility

The application is not meant to be flexible.

Auditing

We must be able to audit the timetables provided by Navitia as well as the real-time positions. We must also be able to audit what information in-transit users send, in order to blacklist those who (because shit happens) try to abuse the system.

Monitoring and management

Usual system monitoring will be used.

Reliability

We will also monitor the number of users in transit and waiting, as well as the delay between the moment one user starts their transit and the moment another user on the same line receives the delay information.

Failover/disaster recovery targets (e.g. manual vs automatic, how long will this take?)

We should be able to recover from the loss of a data center in less than one day.

Business continuity

N/A

Interoperability

N/A

Legal, compliance and regulatory requirements (e.g. data protection act)

N/A

Internationalisation (i18n) and localisation (L10n)

The application will be deployed in all countries where public transport systems provide APIs (and have potential delays). For simplicity, the application will first be deployed in France.

Accessibility

We don’t yet know how to validate this.

Usability

We don’t yet know how to validate this.

3.2. Specific constraints

These constraints map to the relationships expressed in the Context section.

3.2.1. Ingesting time tables

Timetables should be ingested from Navitia each day. The process should be monitored, since there should be no day on which this data is not ingested. This ingestion should be in place before the first release (the application has no interest without it).

3.2.2. Ingesting real-time positions

This should be done as a continuous stream. Ingestion lag should not exceed one minute. The application should be able to work without this information.

3.2.3. Seeing train delays

Delays should be communicated to the user in less than 1 s. If the user’s connection to the system does not allow that, delays will be sent to the user with a notification indicating that the network is not fast enough to get accurate timetables.

3.2.4. Informing application that transport is running

A transport is considered to be moving after 5 seconds of continuous movement. After this delay, the signal should be sent in less than 1 s.
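The 5-second rule above can be sketched as a tiny state machine. The MovementDetector class and its sampling interface are assumptions for illustration, not part of the actual design.

```java
// Hypothetical sketch of the "transport is moving" rule:
// the transport counts as moving only after 5 seconds of
// continuous movement; any stop resets the window.
public class MovementDetector {

    private static final long THRESHOLD_MILLIS = 5_000;
    private long movingSince = -1; // -1 means "currently not moving"

    // Feed one location sample; returns true once the 5 s rule is satisfied.
    public boolean onSample(long timestampMillis, boolean moving) {
        if (!moving) {
            movingSince = -1; // a stop resets the continuous-movement window
            return false;
        }
        if (movingSince < 0) {
            movingSince = timestampMillis;
        }
        return timestampMillis - movingSince >= THRESHOLD_MILLIS;
    }
}
```

Once this returns true, the "transport is running" signal would be emitted within the 1 s budget stated above.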

4. Constraints


4.1. Common constraints

Time, budget and resources

The project will be built by Nicolas Delsaux and Logan Hauspie. The budget is zero, as this is an example. The application is expected to be delivered …​ one day.

Approved technology lists and technology constraints

Server-side components of the application will be deployed as containers.

Target deployment platform

The application will be deployed on a Google Kubernetes Engine cluster.

Existing systems and integration standards

TODO

Local standards (e.g. development, coding, etc)

TODO

Public standards (e.g. HTTP, SOAP, XML, XML Schema, WSDL, etc)

TODO

Standard protocols

TODO

Standard message formats

TODO

Size of the software development team

Two people at most.

Skill profile of the software development team

Developers are skilled in server-side development, less so in front-end.

Nature of the software being built (e.g. tactical or strategic)

Strategic, as it is the only product of our startup.

Political constraints

TODO

Use of internal intellectual property

TODO

5. Principles


The team will adhere to the following set of principles.

  • As it is an example project, we follow the programming, motherfucker methodology.

  • There should be no operational management cost, so

    • Application should be auto-redeployed

    • Application should be self-healing

  • Application should use messaging and async systems as much as possible

  • Interfaces between components should be documented

6. Software Architecture

kafkatrain.containers

We use Kafka to fully isolate load between sncfReader and storage. We use Elasticsearch to support various kinds of searches, since we will query on geographic criteria as well as proximity.
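As a sketch of the proximity searches mentioned above, an Elasticsearch geo_distance filter could look like the following. The index mapping and the stop.location field name are assumptions, not the actual schema.

```json
{
  "query": {
    "bool": {
      "filter": {
        "geo_distance": {
          "distance": "500m",
          "stop.location": { "lat": 48.8443, "lon": 2.3744 }
        }
      }
    }
  }
}
```

Such a query would return the stops within 500 m of the user, from which the candidate transports can be derived.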

6.1. Software architecture of sncfReader component

sncfReader.components

This one is quite simple: one verticle reads data from the Navitia HTTP endpoint and sends the obtained data, through the Vert.x event bus, to another verticle which outputs the data to a Kafka stream.
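The two-stage handoff described above can be sketched with the standard library alone. A minimal sketch, in which a BlockingQueue stands in for the Vert.x event bus and an in-memory list stands in for the Kafka topic; both stand-ins are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Minimal sketch of the two-verticle pipeline: a reader stage hands
// payloads to a writer stage through a queue. In the real component,
// the queue is the Vert.x event bus and the sink is a Kafka producer.
public class ReaderPipeline {

    private final BlockingQueue<String> bus = new ArrayBlockingQueue<>(100);
    private final List<String> kafkaSink = new ArrayList<>(); // stand-in for a Kafka topic

    // Stage 1: the "reader verticle" publishes a fetched payload on the bus.
    // Returns false if the bus is saturated (backpressure).
    public boolean onNavitiaResponse(String payload) {
        return bus.offer(payload);
    }

    // Stage 2: the "writer verticle" drains the bus into the sink.
    public void drainToKafka() {
        List<String> batch = new ArrayList<>();
        bus.drainTo(batch);
        kafkaSink.addAll(batch);
    }

    public List<String> sinkContents() {
        return List.copyOf(kafkaSink);
    }
}
```

The point of the intermediate queue is exactly the isolation mentioned in the previous section: the reader can fetch at its own pace regardless of how fast the sink consumes.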

7. Code


7.1. kafkatrain

7.1.1. Having a train on time is Kafkaesque

Main repository for our Snowcamp 2019 talk

What does this repository contain?
  • src/build contains various build scripts

    • 0-install.sh installs the environment, provided the secrets are known

    • 1-write-reader-code.bat copies reader code in its own repository

    • 2-write-web-ui.bat copies web ui in its own repository

    • delete.bat deletes the cluster, and the various generated projects

  • src/k8s contains everything deployed into the k8s cluster

    • elastic provides ingresses for Kibana and Elasticsearch (DON’T DO THAT IN PROD)

    • kafka installs all additional applications for Kafka

Meta

Contributing
  1. Fork it (<https://github.com/Riduidel/snowcamp-2019/fork>)

  2. Create your feature branch (git checkout -b feature/fooBar)

  3. Commit your changes (git commit -am 'Add some fooBar')

  4. Push to the branch (git push origin feature/fooBar)

  5. Create a new Pull Request

7.1.2. sncf-reader

The sncf-reader application allows downloading the SNCF train schedules from Navitia. Since we try to write data into Kafka, we could have written a command-line application restarted once in a while. But we preferred, in order to have some kind of enterprisey system, to start a Vert.x application, because it is simple and fast (vert fast, indeed).

sncf-reader application

This application allows us to inject the SNCF timetables into our Kafka engine for later processing.

Configuration

This application requires the following environment variables to be set:

  • SNCF_READER_TOKEN The access token for the Navitia API

  • SNCF_READER_READ_AT_STARTUP When set to true, immediately start reading the SNCF timetables at startup

  • SNCF_READER_KAFKA_BOOTSTRAP_SERVER URL of the Kafka server to connect to

  • SNCF_READER_TOPIC_SCHEDULE Topic to which schedules are posted. Defaults to sncfReaderSchedule
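The configuration above could be resolved as sketched below. The ReaderConfig class is hypothetical; only the variable names and the topic default come from this document. Taking the environment as a Map keeps the lookup testable (in production it would be Map.copyOf(System.getenv())).

```java
import java.util.Map;

// Hypothetical sketch of how sncf-reader could resolve its configuration.
// Variable names and the topic default match the documentation above;
// everything else is an assumption for illustration.
public class ReaderConfig {

    public final String token;
    public final boolean readAtStartup;
    public final String bootstrapServer;
    public final String scheduleTopic;

    public ReaderConfig(Map<String, String> env) {
        token = require(env, "SNCF_READER_TOKEN");
        readAtStartup = Boolean.parseBoolean(
                env.getOrDefault("SNCF_READER_READ_AT_STARTUP", "false"));
        bootstrapServer = require(env, "SNCF_READER_KAFKA_BOOTSTRAP_SERVER");
        // Only the topic has a documented default.
        scheduleTopic = env.getOrDefault("SNCF_READER_TOPIC_SCHEDULE", "sncfReaderSchedule");
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```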

7.1.3. web-ui

This container is responsible for displaying timetables in a "nice" UI. This is the simplest possible JavaScript application one can imagine:

  • the server-side component (all in server.js) provides

    • a route to display the index.html page.

    • a route allowing searches in the Elasticsearch index using specific criteria

  • the client-side component (all in index.html) displays timetables from search engine results

node

Simple Hello World that listens on localhost:8080

8. Data


All kafkatrain data is stored in Elasticsearch, in the format provided by Navitia. As this is an example, no particular backup/persistence/optimization is provided.

9. Infrastructure Architecture

Is there a clear physical architecture?

All application components will be deployed on a Google Kubernetes Engine cluster.

What hardware (virtual or physical) does this include across all tiers?

That depends on how Kubernetes deploys our application.

Does it cater for redundancy, failover and disaster recovery if applicable?

Insofar as Google provides it, yes.

Is it clear how the chosen hardware components have been sized and selected?

We will use the standard Google Kubernetes machine types.

If multiple servers and sites are used, what are the network links between them?

Network links between machines in a given Google datacenter, no more, no less.

Who is responsible for support and maintenance of the infrastructure?

Google

Are there central teams to look after common infrastructure (e.g. databases, message buses, application servers, networks, routers, switches, load balancers, reverse proxies, internet connections, etc)?

Google, again

Who owns the resources?

Google, once again

Are there sufficient environments for development, testing, acceptance, pre-production, production, etc?

I hope so

10. Deployment


The software system will be deployed on Kubernetes using Jenkins X (itself already installed as an operator on Kubernetes). As this example is not live, there is no real deployment that would allow auto-discovery. As a consequence, the deployment diagram will be only "virtual" (in the sense that it is simply written).

deployment

11. Development Environment


TODO

12. Operation and Support


TODO

13. Decision Log


13.1. kafkatrain

13.1.1. Decisions

Fetched from agile-architecture-documentation-example issues with label "decision"

How should we store decisions?

Ticket closed on Jun 18, 2020

Everybody agrees on storing decisions (see the architecture decision record practice). But should we store them as simple text?

Alternative: Use simple texts, as described in the ADR practice

The default agile architecture documentation template uses simple AsciiDoc pages following a template.

Advantages

  • This is simple to write

  • This lives along documentation

  • This can be easily updated

Drawbacks

  • Decisions come and go, and simple text is not so good at conveying change

  • A decision is split into various phases (see the OODA method), and this is very badly rendered in simple text

Alternative: We could use an issue tracker

Advantages

  • It separates the various phases of discussion (discussed, adopted, superseded)

  • It allows each alternative to be clearly viewed

  • We can select interesting parts of discussion

Drawbacks

  • Converting an issue to text is not so trivial

  • We become dependent on another external system

Decision

We will store decisions in GitHub issues, to allow better traceability, a better UI, and so on.