1. Context
1.2. What is this software system about?
Getting "true" public transport delay is often impossible. Because the organism responsible for displaying delays is also the organism that allow transports to run. As a consequence, it is not always their best move to display effective transport delay. This system exists to overcome that.
1.3. What is it that’s being built?
We will build a system that allows us to easily compare the official timetable with the effective transport delay by:
- asking users already in a transport to tell us whether it is late or not
- comparing the effective transport location with the one provided by real-time location services, where they exist
1.4. How does it fit into the existing environment? (e.g. systems, business processes, etc.)
As we’re doing this as a startup, we have no internal context. However, there is an external context.
We will use the services of Navitia, which provides timetables for the public transport systems of all French cities, as well as intercity trains.
We will also use the geolocation services provided by SNCF (for intercity trains) and other providers where possible.
1.5. Who is using it? (users, roles, actors, personas, etc)
We currently envision two types of users (who can in fact be the same person at different times):
- The waiting user, to whom we will send accurate passing-time information.
- The user already in a transport, who can tell waiting users whether the transport was on time or not.
2. Functional Overview
This system allows a person waiting for a public transport to get information on the transport's actual schedule, as provided by people upstream on the same line. Think of it as a crowdsourced SMS from a friend.
Features are quite simple.
2.1. For a waiting person
When someone is waiting for a transport, kafkatrain detects (from the user's location and the timetables) which transports the person may want to board. If multiple transports match, the application lets the user select the one they are waiting for. Once a transport has been selected, information from upstream users is presented. This information will typically take the form of:
At stop "name of stop", transport was "on time|late by n minutes"
As of today, we have not designed the application's UI.
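To make the exchanged information concrete, here is a minimal sketch of the report an upstream user could send, as a data structure. The type and member names (DelayReport, stopName, delayMinutes) are illustrative assumptions, not taken from the actual codebase.

[source,java]
----
/**
 * Illustrative sketch of the message an upstream user could send
 * and a waiting user would receive. All names are hypothetical.
 */
public record DelayReport(
        String lineId,              // identifier of the line, as provided by Navitia
        String stopName,            // stop at which the observation was made
        int delayMinutes,           // 0 means "on time", n > 0 means "late by n minutes"
        java.time.Instant observedAt) {

    /** Renders the report in the form described above. */
    public String asMessage() {
        String status = delayMinutes == 0 ? "on time" : "late by " + delayMinutes + " minutes";
        return "At stop \"" + stopName + "\", transport was \"" + status + "\"";
    }
}
----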
3. Quality Attributes
3.1. Common constraints
- Performance (e.g. latency and throughput): The application should be available to any user within 5 seconds.
- Scalability (e.g. data and traffic volumes): We expect a first deployment with 1,000 users (and 100 simultaneous user connections).
- Availability (e.g. uptime, downtime, scheduled maintenance, 24x7, 99.9%, etc.): The application will initially target 99.9% availability.
- Security (e.g. authentication, authorisation, data confidentiality, etc.): No user data should be stored by the system. Authentication and authorization will be managed using OpenID Connect with the usual identity providers (Google, Facebook, …).
- Extensibility: The application will be extended to various geographic areas and types of public transport (buses, trains, boats), but there will be no feature extensibility.
- Flexibility: The application is not expected to be flexible.
- Auditing: We must be able to audit the timetables provided by Navitia as well as the real-time positions. We must also be able to audit the information users in transit send, so that we can blacklist those who (because it will happen) try to abuse the system.
- Monitoring and management: Standard system monitoring will be used.
- Reliability: We will also monitor the number of users in transit and waiting, as well as the delay between the moment one user starts their transit and the moment another user on the same line receives the delay information.
- Failover/disaster recovery targets (e.g. manual vs automatic, how long will this take?): We should be able to recover from the loss of a data center in less than one day.
- Business continuity: N/A
- Interoperability: N/A
- Legal, compliance and regulatory requirements (e.g. data protection act): N/A
- Internationalisation (i18n) and localisation (L10n): The application will be deployed in all countries where public transport systems provide APIs (and experience delays). For simplicity, it will first be deployed in France.
- Accessibility: We do not yet know how to validate this.
- Usability: We do not yet know how to validate this.
3.2. Specific constraints
These constraints map to the relationships expressed in the Context section.
3.2.1. Ingesting time tables
Timetables should be ingested from Navitia every day. The process should be monitored, since there should be no day where this data is not ingested. This ingestion must work before the first release (the application has no value without it). A sketch of how it could be scheduled is shown below.
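As a rough illustration of this daily ingestion requirement, the sketch below shows how a vert.x verticle could schedule the Navitia read once a day. The class and method names (ScheduleIngesterVerticle, ingestTimetables) are hypothetical; the actual sncf-reader code may be organized differently.

[source,java]
----
import io.vertx.core.AbstractVerticle;

import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: triggers the Navitia timetable ingestion once per day. */
public class ScheduleIngesterVerticle extends AbstractVerticle {

    private static final long ONE_DAY_MS = TimeUnit.DAYS.toMillis(1);

    @Override
    public void start() {
        // Run a first ingestion at startup, then repeat every 24 hours.
        ingestTimetables();
        vertx.setPeriodic(ONE_DAY_MS, timerId -> ingestTimetables());
    }

    private void ingestTimetables() {
        // Call the Navitia API and push the resulting schedules to Kafka.
        // Failures must be reported to monitoring, since a missed day
        // violates the constraint stated above.
    }
}
----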
3.2.2. Ingesting real-time positions
This should be done as a continuous stream, with an ingestion delay of no more than one minute. The application should be able to work without this information. A sketch of such a stream is shown below.
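To illustrate what the ingestion side of this continuous stream could look like, here is a minimal Kafka producer sketch. The topic name (realTimePositions), the polling interval and the fetchPositions() helper are assumptions made for illustration; they are not taken from the actual code.

[source,java]
----
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

import java.util.List;
import java.util.Properties;

/** Hypothetical sketch: pushes real-time positions to Kafka as they arrive. */
public class PositionStreamer {

    public static void main(String[] args) throws InterruptedException {
        Properties props = new Properties();
        props.put("bootstrap.servers", System.getenv("SNCF_READER_KAFKA_BOOTSTRAP_SERVER"));
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            while (true) {
                // fetchPositions() stands for a call to the SNCF geolocation API.
                for (String positionJson : fetchPositions()) {
                    producer.send(new ProducerRecord<>("realTimePositions", positionJson));
                }
                // Poll well under the one-minute freshness constraint.
                Thread.sleep(30_000);
            }
        }
    }

    private static List<String> fetchPositions() {
        return List.of(); // placeholder
    }
}
----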
4. Constraints
4.1. Common constraints
- Time, budget and resources: The project will be built by Nicolas Delsaux and Logan Hauspie. The budget is zero, as this is an example project. The application is expected to be delivered … one day.
- Approved technology lists and technology constraints: Server-side components of the application will be deployed as containers.
- Target deployment platform: The application will be deployed on a Google Kubernetes cluster.
- Existing systems and integration standards: TODO
- Local standards (e.g. development, coding, etc.): TODO
- Public standards (e.g. HTTP, SOAP, XML, XML Schema, WSDL, etc.): TODO
- Standard protocols: TODO
- Standard message formats: TODO
- Size of the software development team: Two people at most.
- Skill profile of the software development team: Developers are skilled on the server side, less so on the front end.
- Nature of the software being built (e.g. tactical or strategic): Strategic, as it is the only product of our startup.
- Political constraints: TODO
- Use of internal intellectual property: TODO
5. Principles
The team will adhere to the following set of principles.
- As this is an example project, we follow the "Programming, Motherfucker" methodology.
- There should be no operational management cost, so:
  - the application should be auto-redeployed;
  - the application should be self-healing.
- The application should use messaging and asynchronous systems as much as possible.
- Interfaces between components should be documented.
6. Software Architecture
We use Kafka to fully isolate load between sncfReader and storage. We use Elasticsearch to support several kinds of search, as we will query on geographic criteria as well as proximity. A sketch of such a query is shown below.
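To illustrate the proximity search mentioned above, here is a sketch of a geo-distance query using the Elasticsearch high-level REST client. The index name (schedules), the location field, the host name and the coordinates are illustrative assumptions; the real mapping follows whatever Navitia provides.

[source,java]
----
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

/** Hypothetical sketch: finds schedule documents close to the user's position. */
public class NearbyStopsSearch {

    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("elasticsearch", 9200, "http")))) {

            // Documents within 500 m of an (arbitrary) user position in Paris.
            SearchSourceBuilder source = new SearchSourceBuilder()
                    .query(QueryBuilders.geoDistanceQuery("location")
                            .point(48.8566, 2.3522)
                            .distance("500m"));

            SearchResponse response = client.search(
                    new SearchRequest("schedules").source(source), RequestOptions.DEFAULT);
            response.getHits().forEach(hit -> System.out.println(hit.getSourceAsString()));
        }
    }
}
----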
7. Code
7.1. kafkatrain
7.1.1. Avoir un train à l'heure, c'est kafkaïen ("Getting a train on time is Kafkaesque")
Main repository of our Snowcamp 2019 talk.
What does this repository contain?
- src/build contains various build scripts:
  - 0-install.sh installs the environment, provided the secrets are known
  - 1-write-reader-code.bat copies the reader code into its own repository
  - 2-write-web-ui.bat copies the web UI into its own repository
  - delete.bat deletes the cluster and the various generated projects
- src/k8s contains everything deployed into the k8s cluster:
  - elastic provides ingresses for Kibana and Elasticsearch (DON'T DO THAT IN PROD)
  - kafka installs all additional applications for Kafka
7.1.2. sncf-reader
The sncf-reader application downloads SNCF train schedules from Navitia. Since all we do is write data into Kafka, we could have written a command-line application restarted once in a while. But, in order to have some kind of enterprisey system, we prefer to start a vert.x application, because it is simple and fast (vert' fast, indeed).
sncf-reader application
This application allows us to inject SNCF timetables into our Kafka engine, for later processing.
Configuration
This application requires the following environment variables to be set:
- SNCF_READER_TOKEN: access token for the Navitia API
- SNCF_READER_READ_AT_STARTUP: when set to true, immediately start reading the SNCF timetables
- SNCF_READER_KAFKA_BOOTSTRAP_SERVER: URL of the Kafka server to connect to
- SNCF_READER_TOPIC_SCHEDULE: topic where schedules are posted; defaults to sncfReaderSchedule
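As a sketch of how sncf-reader might consume this configuration, here is a minimal vert.x verticle reading the variables above. The class name and the overall structure are illustrative assumptions; only the variable names and the default topic come from the list above.

[source,java]
----
import io.vertx.core.AbstractVerticle;
import io.vertx.core.Vertx;

/** Hypothetical sketch: reads the sncf-reader configuration from the environment. */
public class SncfReaderVerticle extends AbstractVerticle {

    @Override
    public void start() {
        String token = System.getenv("SNCF_READER_TOKEN");
        boolean readAtStartup = Boolean.parseBoolean(System.getenv("SNCF_READER_READ_AT_STARTUP"));
        String bootstrapServer = System.getenv("SNCF_READER_KAFKA_BOOTSTRAP_SERVER");
        String scheduleTopic = System.getenv()
                .getOrDefault("SNCF_READER_TOPIC_SCHEDULE", "sncfReaderSchedule");

        if (readAtStartup) {
            // Read the SNCF timetables from Navitia (authenticated with token)
            // and publish them to scheduleTopic through bootstrapServer.
        }
    }

    public static void main(String[] args) {
        Vertx.vertx().deployVerticle(new SncfReaderVerticle());
    }
}
----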
7.1.3. web-ui
This container is responsible for displaying timetables in a "nice" UI. This is the simplest possible JavaScript application one can imagine:
- the server-side component (all in server.js) provides:
  - a route to display the index.html page
  - a route to search the Elasticsearch index using specific criteria
- the client-side component (all in index.html) displays timetables from the search engine results
8. Data
All kafkatrain data is stored in Elasticsearch, in the format provided by Navitia. As this is an example, no particular backup, persistence, or optimization is provided.
9. Infrastructure Architecture
- Is there a clear physical architecture? All application components will be deployed on a Google Kubernetes cluster.
- What hardware (virtual or physical) does this include across all tiers? That depends on how Kubernetes deploys our application.
- Does it cater for redundancy, failover and disaster recovery if applicable? Only to the extent that Google provides it.
- Is it clear how the chosen hardware components have been sized and selected? We will use the standard Google Kubernetes machine types.
- If multiple servers and sites are used, what are the network links between them? Network links between machines in a given Google data center, no more, no less.
- Who is responsible for support and maintenance of the infrastructure? Google.
- Are there central teams to look after common infrastructure (e.g. databases, message buses, application servers, networks, routers, switches, load balancers, reverse proxies, internet connections, etc.)? Google, again.
- Who owns the resources? Google, once again.
- Are there sufficient environments for development, testing, acceptance, pre-production, production, etc.? We hope so.
10. Deployment
The software system will be deployed on Kubernetes using Jenkins X (itself already installed as an operator on Kubernetes). As this example is not live, there is no real deployment that would allow auto-discovery. As a consequence, the deployment diagram will only be "virtual" (in the sense that it is simply written by hand).
11. Development Environment
TODO
12. Operation and Support
TODO
13. Decision Log
13.1. kafkatrain
13.1.1. Decisions
Fetched from agile-architecture-documentation-example issues with label "decision"
How should we store decisions?
Ticket closed on Jun 18, 2020
Everybody loves the idea of storing decisions (see the architecture decision record practice). But should we store them as simple text?
Alternative: use simple texts, as described in the ADR practice
The default agile architecture documentation template uses simple AsciiDoc pages following a template.
Advantages
- It is simple to write
- It lives alongside the documentation
- It can be easily updated
Drawbacks
- Decisions come and go, and simple text is not good at conveying change
- A decision is separated into several phases (see the OODA method), and this is very badly rendered in simple text
Alternative: we could use an issue tracker
Advantages
- It separates the various phases of discussion (discussed, adopted, superseded)
- It allows each alternative to be clearly viewed
- We can select the interesting parts of the discussion
Drawbacks
- Converting an issue to a text is not trivial
- We become dependent on another external system