Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Trip-to-trip and route-to-route transfers #284

Merged
merged 8 commits into from
Oct 5, 2021

Conversation

scmcca
Copy link
Contributor

@scmcca scmcca commented Sep 17, 2021

Trip-to-trip transfers as flagged in #278 to define the primary ID for transfers.txt. This PR contains an extension from Google as previously advocated for by @antrim in #32.

Seeing that this is an extension, it would be good for producers and consumers to provide proof of implementation here in the discussion.

Review and further discussions are welcomed. Thanks!

@scmcca scmcca added the GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule label Sep 17, 2021
@google-cla google-cla bot added the cla: yes label Sep 17, 2021
@paulswartz
Copy link
Contributor

@mbta's GTFS provides this data.

@skinkie
Copy link
Contributor

skinkie commented Sep 17, 2021

Would it be too much to ask to include the route-to-route one too? https://developers.google.com/transit/gtfs/reference/gtfs-extensions#TripToTripTransfers

@hannesj
Copy link

hannesj commented Sep 17, 2021

OpenTripPlanner version 2 supports both trip-to-trip and route-to-route, as well as any combination of these and regular stop-to-stop transfers.

@gcamp
Copy link
Contributor

gcamp commented Sep 17, 2021

I wouldn't want to explode the scope of this PR and I think this is a good improvement on it's own but following this it would be good to look into in-seats transfers and how they interact with block_ids where trip to trip transfers can be the basis to build upon.

So +1 but I see this as the start of more changes.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 17, 2021

I added route-to-route transfers in this PR (feel free to review if it makes sense).

I agree that in-seat transfers and block_id should be addressed in a separate PR.

@scmcca scmcca changed the title Trip-to-trip transfers Trip-to-trip and route-to-route transfers Sep 17, 2021
@skinkie
Copy link
Contributor

skinkie commented Sep 17, 2021

@scmcca could you describe in this PR what you consider valid combinations, mandatory combinations and invalid combinations?

@scmcca
Copy link
Contributor Author

scmcca commented Sep 17, 2021

@skinkie Correct me if I'm wrong, but I think any combination is allowed as long as the trip_id and route_id have the stop_id defined in the required from/to_stop_id fields. Can you elaborate on the cases/issues I'm missing?

@skinkie
Copy link
Contributor

skinkie commented Sep 17, 2021

@skinkie Correct me if I'm wrong, but I think any combination is allowed as long as the trip_id and route_id have the stop_id defined in the required from/to_stop_id fields. Can you elaborate on the cases/issues I'm missing?

A from_trip_id that does not match from_route_id? Similar for to_trip_id / to_route_id.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 17, 2021

Good point. I'll add wording to require they match.

@skinkie
Copy link
Contributor

skinkie commented Sep 17, 2021

Good point. I'll add wording to require they match.

I think route + trip could be considered redundant, but still valid if they match.
But I also wonder if trip to route would be valid.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 17, 2021

I think route + trip could be considered redundant, but still valid if they match.
But I also wonder if trip to route would be valid.

I think trip to route would be valid if the trips.route_idto_route_id. This assumes that transfers between the same route is an invalid concept. Or is that a part of the in-seat transfer block_id discussion that I'm missing?

@skinkie
Copy link
Contributor

skinkie commented Sep 17, 2021

This assumes that transfers between the same route is an invalid concept. Or is that a part of the in-seat transfer block_id discussion that I'm missing?

I think the same route in the opposite direction could still be a guaranteed transfer. But I think we need some clear semantics on what can be expected.

@Bertware
Copy link

Bertware commented Sep 20, 2021

I believe we need to define the "specificity" of these transfer rules. If from_trip_id is specified, from_route_id should be ignored because the trip is more specific and the from_route_id wouldn't add anything anyway. I would even favour to forbid the combination of from_trip and from_route in the same row. The same goes for to_route_id and to_trip_id. There should be no room for interpretation on the consumer side.

From the google extension:

The from_route_id and to_route_id fields can contain a route_id (as specified by routes.txt), reducing the scope to which the given transfer applies. If from_route_id is specified, the transfer will only apply to the arriving trip with the given route id, at the given from_stop_id. If to_route_id is specified, the transfer will only apply to the departing trip with given route id, at the given to_stop_id.

The from_trip_id and to_trip_id fields can contain a trip_id, as specified by trips.txt. If from_trip_id is given, from_route_id is ignored, and if to_trip_id is given, to_route_id is ignored. If from_trip_id is specified, the transfer will only apply to the arriving trip with the given trip id, at the given from_stop_id. If to_trip_id is specified, the transfer will only apply to the departing trip with the given trip id, at the given to_stop_id.
Specificity of a transfer

Some transfers are more specific than others. We want to define a simple ranking mechanism to determine when a transfer should be applied. We thus define the "specificity" of a transfer.

The specificity of the source of the transfer is 0 if only from_stop_id is given, 1 if from_route_id is specified, and 2 if from_trip_id is specified. The same applies for target: 0 if only to_stop_id is given, 1 if to_route_id is specified, and 2 if to_trip_id is specified. The sum of these two values gives the specificity of the transfer, between 0 and 4 inclusive. For a given ordered pair of arriving trip and departing trip, the transfer with the greatest specificity that applies between these two trips is chosen. So for any pair of trips, there should NOT be two transfers with equally maximal specificity that could apply.

Example of an ambiguous rule:

from_stop_id,to_stop_id,from_route_id,to_route_id,transfer_type
stopFrom,stopTo,routeFrom,,0
stopFrom,stopTo,,routeTo,1

These two transfers both have a specificity of 1. But for transfers between a trip with id route id routeFrom arriving at stop stopFrom, to a trip with route id routeTo arriving at stop stopTo, either of these two rules can apply.

The resulting priority for picking the right transfer rule becomes the following:
(note that from_stop_id and to_stop_id are always set and used, but I left them out of this ordering example for readability reasons)

  1. both trip_ids set: from_trip_id + to_trip_id
  2. one trip_id and one route_id set: (from_trip_id + to_route_id) or (from_route_id + to_trip_id)
  3. one trip_id set: from_trip_id or to_trip_id
  4. both route_ids set: from_route_id + to_route_id
  5. one route_id set: to_route_id or from_route_id
  6. "normal" transfers: no route or trip related fields set

Samtrafiken/Trafiklab is a producer, our GTFS feeds use the combination from_trip_id + to_trip_id

@scmcca
Copy link
Contributor Author

scmcca commented Sep 20, 2021

@Bertware Thanks for the clear summary of the specificity rules! I'm thinking these could go in the description at the top of the transfers.txt file.

I would even favour forbid the combination of from_trip and from_route in the same row. The same goes for to_route_id and to_trip_id.

I agree. Would it be suitable to introduce the "Conditionally Forbidden" concept here:

Conditionally Forbidden - The field or file is forbidden under certain conditions that are outlined in the field or file description. Otherwise, the field or file is optional.

And then use this to forbid from_route_id with from_trip_id, etc.

scmcca added 3 commits September 20, 2021 11:49
Editorial to reword Conditionally Forbidden to RFC 2119
@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

It looks like the proposal is stable. Let's open this for a vote. The vote is for adding the following fields in transfers.txt, as outlined in this PR:

  • from_route_id
  • to_route_id
  • from_trip_id
  • to_trip_id

As per the specification amendment process, there is at least one producer (Samtrafiken/Trafiklab) and one consumer (OpenTripPlanner) implementing this extension.

Voting ends on 2021-10-04 at 23:59:59 UTC.

Any remaining feedback is welcomed. Thanks.

@scmcca scmcca added the Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md label Sep 27, 2021
@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

Regarding the conditionally forbidden *_trip_id with *_route_id. If the _route_id would match the _trip_id match, that wouldn't hurt right? My line of thinking: The minimum is stop_id, can be extended to stop_id to route_id, and most specifically stop_id, route_id and trip_id.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

@skinkie

Regarding the conditionally forbidden *_trip_id with *_route_id. If the _route_id would match the _trip_id match, that wouldn't hurt right? My line of thinking: The minimum is stop_id, can be extended to stop_id to route_id, and most specifically stop_id, route_id and trip_id.

The suggestion to provide either *_trip_id or *_route_id, but not both, came from @Bertware in this comment which seems like a good idea:

I would even favour forbid the combination of from_trip and from_route in the same row. The same goes for to_route_id and to_trip_id. There should be no room for interpretation on the consumer side.

However I can see a case for changing this to define that *_trip_id overrides *_route_id when both are defined, as this is what is defined in the original Google proposal:

If from_trip_id is given, from_route_id is ignored, and if to_trip_id is given, to_route_id is ignored.

@Bertware @gcamp, any strong opinions on this?

@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

@scmcca I would like to have the reasoning why route_id + trip_id would be bad. I could understand it is redundant.

@gcamp
Copy link
Contributor

gcamp commented Sep 27, 2021

However I can see a case for changing this to define that *_trip_id overrides *_route_id when both are defined, as this is what is defined in the original Google proposal:

To me that seems like implementation detail from Google, not something the spec should care. Agree with @skinkie that redundant information shouldn't necessary be forbidden. I don't think we want to forbid providing stop_id and stop_sequence for example.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

Can I suggest we make all the added files optional with a note that:

If *_trip_id and *_route_id are defined, *_trip_id will take precedence.

@brodyFlannigan
Copy link

brodyFlannigan commented Sep 27, 2021

+1 Brody Flannigan Transit Data on behalf of Transcollines.

You can add Transcollines as a producer, we have been using those fields for quite some time. Details on our GTFS repo.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

@gcamp Good point. I updated the PR with the proposed language above.

@skinkie Does this LGTY?

@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

@scmcca i think the documentation is still a bit problematic.

"If both from_trip_id and from_route_id are defined, the trip_id must belong to the route_id. In this case, from_trip_id will take precedence."

I think this makes sense for the route_id definition. But I rather have it rephrased for the second (lines 352/353) something like "Only the trip is considered to be referenced." (Someone with proper writing skills please propose something...)

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

@skinkie That text is duplicated in the trip fields to reiterate the interaction between *_route_id and *_trip_id. Personally I think this is good for consistency and comprehension, but maybe that is not the case.

I'd be happy to change it, but can you help me out by defining what is problematic about having that text reiterated in the trip fields?

@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

I think the emphasis with the trip_id fields should be on the combination of fields referring to a trip.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

@skinkie

something like "Only the trip is considered to be referenced."

Is this captured with the proposed:

In this case, *_trip_id will take precedence.

@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

@scmcca to me this gives the feeling that the route_id still may be unrelated, and not a holistic compound field.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 27, 2021

@skinkie How is it a compound field? My understanding as written in the Google proposal is that if both *_trip_id and *_route_id are provided, *_route_id is ignored. I'm not sure how the compound field concept is useful for spec comprehension here. It's more intuitive to understand this simply as "if x and y are provided, use x".

@skinkie
Copy link
Contributor

skinkie commented Sep 27, 2021

I don't know how your datamodel works. But in our data model one trip has one route, similar to how trips.txt is defined. Hence x => (y,x).

@Bertware
Copy link

Bertware commented Sep 28, 2021

@scmcca I would like to have the reasoning why route_id + trip_id would be bad. I could understand it is redundant.

I would say route_id is not related to the transfer rule if trip_id is set, it could even cause misdirection if it is not sufficiently clear that one should not read route_id if trip_id is set,

If a transfer rule to a certain trip is created, then that rule says nothing about transfers to that trip's route, it only says something about transfers to that specific trip. Therefore I believe that the route_id should not be set, as it is not related to the transfer rule that is being defined.

Since it is redundant, not offering a benefit, and possibly being misleading, I believe it should not be allowed by the spec.

But I rather have it rephrased for the second (lines 352/353) something like "Only the trip is considered to be referenced." (Someone with proper writing skills please propose something...)

This would work for me too, but I do not understand why one would or should be able to add the route id if it is to be ignored? At best it has no positive or negative effects, in general it will cause files to be larger and slower to parse (albeit likely insignificant). I don't mind if files are a larger or slower to parse, but making the files bigger without conveying any extra data seems like a waste to me.

I would like a reasoning or an example as to why it must be allowed/possible. The (always correct) route id can easily be obtained by reading trips.txt. To me, it seems like a trivial thing for a producer to not set a route_id if a trip_id is provided. Am I missing some specific case case? Are there providers already creating feeds that would not be compatible with this specification?

I don't know how your datamodel works. But in our data model one trip has one route, similar to how trips.txt is defined. Hence x => (y,x).

This is why I would not include from_route_id if from_trip_id is set, or to_route_id when to_trip_id is set. This data is already present in the dataset, and adding it once more in the transfers.txt file would in my opinion achieve the same as adding agency_id to stop_times.txt.

@gcamp
Copy link
Contributor

gcamp commented Sep 28, 2021

@Bertware I understand the reasoning but for debugging and for data producers, having the route next to the trip can be useful to give context on the trip.

@Bertware
Copy link

Bertware commented Sep 28, 2021

Small editorial suggestion that doesn't affect the spec: Replacing "in this case" with ", and" makes it more clear that to_trip_id always takes precedence. Just to make it clear that it doesn't just take precedence in the case of trip_id and route_id matching.

If both to_trip_id and to_route_id are defined, the trip_id must belong to the route_id. In this case, to_trip_id will take precedence.

Suggestion:

If both to_trip_id and to_route_id are defined, the trip_id must belong to the route_id, and to_trip_id will take precedence.

Same for from_trip_id and from_route_id.

@scmcca
Copy link
Contributor Author

scmcca commented Sep 29, 2021

@Bertware I applied the rewording you suggested in the previous comment.

As for:

Therefore I believe that the route_id should not be set, as it is not related to the transfer rule that is being defined.

Though I tend to agree that this is better, this was not how the original Google proposal was written. Therefore, those who have implemented this extension already have instances where a route_id is defined "redundantly". Forbidding it here would force historical implementations to change, which makes this a tough sell IMHO.

@Bertware
Copy link

Bertware commented Sep 30, 2021

@scmcca thanks for the example/reasoning, other than this minor remark it looks good to us.

+1 From Samtrafiken/Trafiklab

@skinkie
Copy link
Contributor

skinkie commented Sep 30, 2021

+1 OpenGeo

@gcamp
Copy link
Contributor

gcamp commented Sep 30, 2021

+1 Transit

@paulswartz
Copy link
Contributor

+1 @mbta

@flocsy
Copy link
Contributor

flocsy commented Oct 5, 2021

+1 Moovit

@scmcca
Copy link
Contributor Author

scmcca commented Oct 5, 2021

The vote ended on 2021-10-04 at 23:59:59 UTC with 5 votes in favor and 0 opposed. As per the Specification Amendment Process, this vote passes!

Thanks to all involved.

@scmcca scmcca removed the Status: Voting Pull Requests where the advocate has called for a vote as described in the changes.md label Oct 5, 2021
@scmcca scmcca merged commit 4b00b1b into google:master Oct 5, 2021
@skinkie
Copy link
Contributor

skinkie commented Oct 5, 2021

@scmcca thanks :-) So now back to our primary key story.

@Bertware
Copy link

Bertware commented Oct 5, 2021

Thanks @scmcca for leading this PR!

@BodoMinea
Copy link

Is my understanding correct in regards to the fact that this PR has nothing to do with in-seat transfer but makes reference specifically to trips/routes waiting for each other at transfer points?

I am still on my „quest” to find a way of representing my weird partial-trip-merge-in-seat-transfer that a few of my partner operators have in their system as part of daily operations:

image
(trying to use block_id in this case leads to an obvious Trip B and trip C both are in the same block and have overlapping arrival times.)

But I guess a different PR/proposal will bring this to the community's attention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GTFS Schedule Issues and Pull Requests that focus on GTFS Schedule
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants