Executive Summary

Objective

To provide a standardized XML format for representing flat track roller derby bout information, currently stored in the “stat-book” spreadsheet, so that it can be easily parsed by a variety of tools for purposes such as website generation, year-end awards, player feedback, and coaching information.

Goals

Define a schema that captures everything that is currently captured by the stat-book, with additional optional information that can capture:

1. Time stamps on events (from potential future real-time data collection tools, as well as gathered from “Bout Time” paperwork)

2. Unique skater identification, to allow tracking of skaters who change names (or skate under different names on home vs travel teams)

3. Support rule variations (such as “20 minute periods, 5 penalty foul-outs” common amongst home team play), as well as different versions of rules (including older “no-minors”), and potential future versions (such as “30 second penalties”).

4. Allow for recording additional staff information, including bench coaches and officials, to allow for tracking participants other than just skaters

5. Import/Export to the stat-book

6. Become a standard interchange mechanism between different roller derby tools

7. History sensitive - skaters will transfer between teams/leagues, teams will come and go, skaters will change names. This schema needs to be able to handle that information.

8. Make the simple, commonplace cases easy to work with, while allowing more complex handling for exceptional cases (for example, an individual who is a skater for one league but an official for another league)

9. Support formal “Sanctioned” bouts, “home bouts”, “mixers” (with teams composed of skaters from multiple leagues), and practice scrimmages

10. Easily human readable, even to the point where a human could create a file manually.

Solution

XML has become a de facto approach to storing data that is shared between systems. Flat Track Derby XML (DerbyXML) is an attempt to meet all of these goals. DerbyXML files (with the “.derbyxml” suffix) can store a variety of different data types (allowing for separate files for each league, for each bout, etc.) or can contain nested information. XML data can also be much smaller than a full spreadsheet and easier to read and write, saving processing power as well as storage space. Because the format is extensible, additional attributes/elements can be added to support future needs (for example, generation of league web pages, tools for managing officials' evaluations, or other league/roster management).

DerbyXML documents are broken down into several different kinds of XML elements:

1. Static data element - such as the venue, skaters, officials. All of this information should be determined before the game starts and should not be edited during game time. These elements are “pre-populated”.

2. Event data element - this is a record of all things that happen during the game, from starting whistle to final whistle (or end of official review at the end of the game). This is the data that is synchronized in real time.

3. Organizational element - the team, period, and jam elements. These enumerated elements are assumed to automatically exist, and will be created when referenced if need be. Thus, no specific node needs to create a new jam or the next period.

4. (Optional) Auditing elements - used to handle collaboration and recovery. This is a record of the event data creation/modification messages and typically need not be saved with the DerbyXML file (but needs to be recorded as part of the game-time synchronization algorithm).
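The "created when referenced" behaviour of the organizational elements can be sketched as a small get-or-create helper. This is only an illustration in Python using the standard library's ElementTree: the helper name get_or_create is hypothetical, and it assumes period and jam elements carry an integer number attribute, as in the examples later in this document.

```python
import xml.etree.ElementTree as ET


def get_or_create(parent, tag, number):
    """Return the <tag number="N"> child of parent, creating it if absent."""
    siblings = parent.findall(tag)
    for child in siblings:
        if int(child.get("number")) == number:
            return child
    new = ET.Element(tag, {"number": str(number)})
    # Keep numbered siblings in ascending order: insert before the first
    # sibling with a larger number, otherwise append at the end.
    for child in siblings:
        if int(child.get("number")) > number:
            parent.insert(list(parent).index(child), new)
            return new
    parent.append(new)
    return new


bout = ET.Element("bout")
# Referencing period 2, jam 3 creates both elements on demand.
jam = get_or_create(get_or_create(bout, "period", 2), "jam", 3)
# Period 1 is created later but still lands before period 2.
get_or_create(bout, "period", 1)
period_numbers = [p.get("number") for p in bout.findall("period")]
print(period_numbers)
```

A tool that receives an event for a jam it has not seen yet could call such a helper before attaching the event, so that no separate "create jam" message is needed.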

The core “top level” information includes:

Rule Set

Defines a rule set, with options for the various game parameters.

Venues

Defines a location to play roller derby.

Skater

Defines a single participant. Normally this is a player, but a skater can also be an official (skating ref or NSO), bench staff, volunteer, announcer, or any other individual involved in the sport (despite using the element name “skater”).

Team

Teams (including “officials teams”) are composed of skaters. Teams can either be persistent teams, or ad-hoc teams (such as those used for “mixers”). Persistent teams also change over time (as skaters get drafted/retired from season to season). Teams can also include rostered skaters.

League

A league can be a member of WFTDA/MRDA, an apprentice league, or a league just starting out. Leagues can include skaters as well as teams.

Bout

Bout data corresponds to a single stat-book spreadsheet. It includes:

1. Venue & time/date information

2. Rule set

3. Rosters for both “home” and “away” teams, as well as officials

3.1. Team identification (league & team name)

3.1.1. Skaters

3.2. Officials and their positions

4. Periods

4.1. Jams

4.1.1. Lineups

4.1.2. Events (scoring, penalties, penalty box records, etc.)

4.2. Timeouts/other stoppages of the clock (including official review information)

4.3. Action/Error summary

Attributes vs Elements + text

XML schemas can be designed to store simple unique properties (i.e., properties that are not arbitrary arrays of other objects) either in an attribute of the node, or as the text content of a child element of that node. There is much debate about which is the best approach, but it often boils down to case-specific situations. For XML markup (such as (X)HTML, DocBook, etc.) it is obvious that the textual content belongs in child elements. However, we want this to be readable (and even writeable) by humans. So compare:

<jam>
<number>1</number>
<timestamp>0:00</timestamp>
<duration>0:42</duration>
<eid>97F2B5D3-C516-4325-BC9C-DC20FF87DD9F</eid>
<line-up>
<team>Home</team>
<skater>123</skater>
<position>jammer</position>
<timestamp>123123</timestamp>
<eid>81D22755-9C45-4229-9E7C-98F8BF7D4D94</eid>
</line-up>
...
</jam>
vs:

<jam number="1" timestamp="0:00" duration="0:42" eid="97F2B5D3-C516-4325-BC9C-DC20FF87DD9F">
<line-up team="Home" skater="123" position="jammer" timestamp="123123" eid="81D22755-9C45-4229-9E7C-98F8BF7D4D94"/>
...
</jam>
Comparing the two, it becomes clear that the second approach is much more compact and readable. So our design rule is that all simple unique properties of an object are stored as string-valued attributes of the node. Any complex (object) property is a child element. Arbitrary textual comments are stored as the text of note child elements. One exception is that an object with only a single string property has that value stored as the text of its element. For example, the uuid element has the actual uuid stored as its text, as opposed to an attribute. Since the uuid element is used when the parent has a list of uuids, we need separate elements anyway and thus can’t just fold those values into a single attribute of the parent node (yes, we could use an XML “attribute list” format, but we’d rather not, especially since some of the data that we will treat this way already includes commas).
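As a rough illustration of how the attribute-based form reads from a tool's point of view, the following Python sketch parses the compact jam example with the standard library's ElementTree. The note text is made up for the example; the element and attribute names are those used in the comparison above.

```python
import xml.etree.ElementTree as ET

JAM_XML = """
<jam number="1" timestamp="0:00" duration="0:42" eid="97F2B5D3-C516-4325-BC9C-DC20FF87DD9F">
  <line-up team="Home" skater="123" position="jammer" timestamp="123123" eid="81D22755-9C45-4229-9E7C-98F8BF7D4D94"/>
  <note>Illustrative comment text, not from a real bout.</note>
</jam>
"""

jam = ET.fromstring(JAM_XML)

# Simple unique properties are plain string attributes of the node.
simple = dict(jam.attrib)

# Complex properties are child elements, each with its own attributes.
lineups = [dict(line_up.attrib) for line_up in jam.findall("line-up")]

# Free-form comments live in the text of note child elements.
notes = [note.text for note in jam.findall("note")]

print(simple["number"], simple["duration"])
print(lineups[0]["position"])
print(notes)
```

Each simple property is a single dictionary lookup, and only the list-like data (line-ups, notes) requires walking child elements.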

References and Unique IDs

One of the big problems is how to handle skaters with multiple names who skate for multiple teams (or who may be an official for another league, etc.). A similar problem crops up with venue names - there are a number of locations that have the same name (but are in different states). It may also happen that data collection tools end up assigning different unique ids to the same bout. Related to this is how to refer to these “not uniquely named” things.

Note that currently, this applies only to venues, persons, and bouts. It is assumed that full league names are unique, and that teams are unique within the league (or, in the case of non-league teams such as “Team USA”, globally unique).

First, all such elements include zero or more “uuid” child elements (or a single “uuid” property). Having multiple uuids may seem counter-intuitive, but it makes it easier to combine data sets (a skater in one dataset can have a different uuid from the same skater in another dataset, and those two uuids can then be “merged”). UUIDs are, however, designed to be unique, with sufficient entropy that collisions should not occur.
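The merging of uuids from two data sets might look something like the sketch below. It is a minimal illustration, assuming a skater element with uuid child elements as described above; the function name merge_uuids and the placeholder uuid values are invented for the example, and deciding that two records describe the same skater is left to the caller.

```python
import xml.etree.ElementTree as ET


def merge_uuids(target, source):
    """Copy uuid children of source into target, skipping duplicates."""
    # UUIDs are compared case-insensitively.
    known = {(u.text or "").strip().lower() for u in target.findall("uuid")}
    for u in source.findall("uuid"):
        value = (u.text or "").strip()
        if value and value.lower() not in known:
            ET.SubElement(target, "uuid").text = value
            known.add(value.lower())
    return target


# The same skater as recorded by two different tools (placeholder uuids).
a = ET.fromstring(
    "<skater><uuid>11111111-1111-4111-8111-111111111111</uuid></skater>"
)
b = ET.fromstring(
    "<skater>"
    "<uuid>22222222-2222-4222-8222-222222222222</uuid>"
    "<uuid>11111111-1111-4111-8111-111111111111</uuid>"
    "</skater>"
)

merged = merge_uuids(a, b)
merged_uuids = [u.text for u in merged.findall("uuid")]
print(merged_uuids)
```

After the merge, a lookup by either uuid finds the same skater element, which is the property that makes combining data sets straightforward.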

Second, within a document, a skater is referenced via a local “id” attribute (unique within the required scope). The suggested format for bout is “team abbreviation_skater number”, but that is not required. This can be used as a “skater” reference attribute when needed (for example, in a “lineup” element, or “event-penalty” element). Note that skater references can be in other forms as well, if “skater” is not specified. These include:

“team” attribute (if needed) + “number” attribute. If the number is unique within the scope of the reference (only a single “123” skater on either roster) or the team is already specified by an enclosing element, the “team” attribute can be omitted. There is also a special “number” attribute value “???” (three question marks) that can be used to indicate an unknown skater (e.g., when the lineup tracker misses one of the skaters)

“name” attribute (assuming the skater’s name is unique within the scope of the reference)

The term “unique within the scope of the reference” means that, if the reference is nested within a bout element, it is unique within that single bout element, even though multiple bout elements may exist in the document. There is also a special “skater” reference value of “-” (dash) to indicate “no skater” (e.g., to indicate that a team skated short in a jam in the lineup).
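The reference rules above can be sketched as a small resolver. This is an illustration only: the roster is represented as a list of plain dictionaries, the skater names are placeholders, and the function name resolve_skater and its enclosing_team parameter are hypothetical. The order of checks (“skater” first, then “number”, then “name”) follows the description in this section.

```python
NO_SKATER = "-"         # special "skater" value: the team skated short
UNKNOWN_NUMBER = "???"  # special "number" value: skater was not identified

ROSTER = [
    {"id": "Home_123", "team": "Home", "number": "123", "name": "Skater A"},
    {"id": "Home_7", "team": "Home", "number": "7", "name": "Skater B"},
    {"id": "Away_7", "team": "Away", "number": "7", "name": "Skater C"},
]


def resolve_skater(ref, roster, enclosing_team=None):
    """Resolve a reference (a dict of attributes) against one bout's roster.

    Returns the roster entry, None for "no skater", or UNKNOWN_NUMBER for an
    unidentified skater. Raises LookupError if the reference is missing,
    unmatched, or not unique within the roster.
    """
    if "skater" in ref:
        if ref["skater"] == NO_SKATER:
            return None
        matches = [s for s in roster if s["id"] == ref["skater"]]
    elif "number" in ref:
        if ref["number"] == UNKNOWN_NUMBER:
            return UNKNOWN_NUMBER
        # The team may come from the reference or from an enclosing element.
        team = ref.get("team", enclosing_team)
        matches = [
            s for s in roster
            if s["number"] == ref["number"] and (team is None or s["team"] == team)
        ]
    elif "name" in ref:
        matches = [s for s in roster if s["name"] == ref["name"]]
    else:
        raise LookupError("reference has no skater, number, or name attribute")
    if len(matches) != 1:
        raise LookupError(f"{ref!r} matched {len(matches)} skaters")
    return matches[0]


print(resolve_skater({"number": "123"}, ROSTER)["id"])
print(resolve_skater({"number": "7"}, ROSTER, enclosing_team="Away")["id"])
```

Note that number “7” alone is ambiguous in this roster, so a reference to it needs a “team” attribute or an enclosing team element, while “123” resolves without one.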

“Well Formed”

In order to be able to effectively merge the XML tree from multiple sources in a distributed manner, some restrictions on the XML need to be in place:

1. Ordered elements must be properly numbered with number attributes, and be in the appropriate order. This applies to period and jam elements. pass elements, since they are events that are mixed with other events in a jam (as well as being independent between the two teams), should be stored in time order along with the other events. However, pass elements should still be ordered with respect to other pass elements from the same team (i.e., pass number=1 should occur before pass number=2).

2. Event elements within a jam should be stored in time order (if available).

3. A default “blank” bout must include three team elements and two period elements (with appropriate number attributes):
<bout>
<team id="home"/>
<team id="away"/>
<team id="officials"/>
<period number="1"/>
<period number="2"/>
</bout>
(It is unclear if there must be a default id attribute for team elements)

4. A person id attribute must obviously be unique in the document, and is suggested to be “teamName_skaterNumber” (though “Home” and “Away” are sufficient for the teams). If the skater is created without a number (for example, a bench coach), some other unique, random number can be used. If the skater’s number is changed for whatever reason (such as at gear check), changing the id attribute is risky, and should only be done if no skater events exist that refer to that id.

5. Once an event has an eid attribute, that attribute must never change.
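A tool could build the default blank bout and check the numbering restriction roughly as follows. This is a sketch under assumptions: “properly numbered” is interpreted here as strictly ascending with no duplicates, jam elements are assumed to be direct children of period elements, and the function names blank_bout and ordering_problems are hypothetical.

```python
import xml.etree.ElementTree as ET


def blank_bout():
    """Build the default blank bout: three teams and two numbered periods."""
    bout = ET.Element("bout")
    for team_id in ("home", "away", "officials"):
        ET.SubElement(bout, "team", {"id": team_id})
    for number in (1, 2):
        ET.SubElement(bout, "period", {"number": str(number)})
    return bout


def ordering_problems(bout):
    """Return a list of messages for period/jam elements that are misordered."""
    problems = []

    def check(parent, tag, label):
        numbers = [int(e.get("number")) for e in parent.findall(tag)]
        # Assumption: numbers must be strictly ascending, without duplicates.
        if numbers != sorted(set(numbers)):
            problems.append(f"{label}: {tag} numbers not ascending: {numbers}")

    check(bout, "period", "bout")
    for period in bout.findall("period"):
        check(period, "jam", f"period {period.get('number')}")
    return problems


good = blank_bout()
bad = ET.fromstring(
    '<bout><period number="2"/>'
    '<period number="1"><jam number="1"/><jam number="1"/></period></bout>'
)
print(ordering_problems(good))
print(ordering_problems(bad))
```

Running such a check before merging trees from multiple sources would catch a misnumbered file early, rather than letting it produce a confusing merge result.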

Event Based

The model for jams is one based on a series of events, designed to mimic the action on the track. This is as opposed to either factual summary (“this action occurred 5 times in the period”) or a “spread sheet” model (which separates the events of each team from the other). The goal is to allow for the interleaving of data as it is collected (if possible), and as such, all events have an optional timestamp element. When data is imported from a stat book worksheet, very little temporal information is provided (unless the bout timer paperwork is filled out, and some additional timing information can be derived from the penalty box paperwork when the box trip record spans across multiple jams).

Timestamps are relative or absolute, and can be specified in a variety of formats:

hh:mm:ss - An absolute time, as specified by a 24-hour wall clock.

s - A time, in seconds, relative to the start of the period (for all elements contained within a period element), or the start of the “epoch” (for period elements).

+s, -s - A time, in seconds, relative to the start or end of the jam, respectively (for all elements contained within a jam element). For example, this can be used for recording when the skater sat in the box based on the penalty box paperwork.

m:ss - A time corresponding to the current period clock. This does not require that the period clock counts in any specific direction - the direction of the period clock is specified in the ruleset element (which defaults to counting down from the period duration if not specified). NB: This needs to factor in timeouts and other stoppages of the period clock to properly convert to an absolute time.
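Because the four formats are distinguishable by shape alone, a reader can classify a timestamp before interpreting it. The sketch below is one possible way to do so in Python; the kind names (wall_clock, period_clock, and so on) are invented labels, and it assumes whole seconds and unambiguous digit counts, neither of which this document specifies.

```python
import re

# Order matters only for readability; the patterns do not overlap.
PATTERNS = [
    ("wall_clock", re.compile(r"(\d{1,2}):(\d{2}):(\d{2})")),  # hh:mm:ss
    ("period_clock", re.compile(r"(\d{1,2}):(\d{2})")),        # m:ss
    ("jam_start_offset", re.compile(r"\+(\d+)")),              # +s
    ("jam_end_offset", re.compile(r"-(\d+)")),                 # -s
    ("period_offset", re.compile(r"(\d+)")),                   # s
]


def classify_timestamp(text):
    """Return (kind, numeric_fields) for a non-estimated timestamp string."""
    for kind, pattern in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return kind, tuple(int(group) for group in match.groups())
    raise ValueError(f"unrecognized timestamp: {text!r}")


for sample in ("19:30:05", "0:42", "+10", "-15", "123123"):
    print(sample, classify_timestamp(sample))
```

Converting the classified value to an absolute time still depends on context (the enclosing period or jam, the ruleset's clock direction, and any clock stoppages), which this sketch deliberately leaves out.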

Timestamps can also be estimated by prefixing the time with a “~” (tilde), or by specifying a range of time separated by a “~” (tilde). This can occur if, for example, we know that Skater 123 got a penalty in period 3, sat in the penalty box, and was still there when the jam ended after its natural conclusion and the penalty box timer stopwatch said “0:15”. From this, we can deduce that the skater sat 15 seconds before the end of the jam (this assumes we have bout times for an absolute value), so the enter-box element timestamp would be “-15”. Since the jam was 2:00 in duration, we know that the penalty occurred sometime between the start of the jam and when the skater sat. So that penalty element timestamp would be “+0~-15”, “+0~+105”, or even “~+95” (ten seconds before the skater sat, estimating that ten seconds passed between the penalty being issued and the skater actually reaching the box and sitting).
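The arithmetic in this example can be checked with a short sketch. It handles only the jam-relative “+s” and “-s” forms, and the function names parse_estimate and jam_offset_seconds are hypothetical; the 120-second jam duration and the ten-second walk to the box come from the example above.

```python
def parse_estimate(text):
    """Split a timestamp into (low, high, estimated) without interpreting it.

    "-15"    -> ("-15", "-15", False)   exact time
    "~+95"   -> ("+95", "+95", True)    single estimated time
    "+0~-15" -> ("+0", "-15", True)     estimated range
    """
    if "~" not in text:
        return text, text, False
    low, _, high = text.partition("~")
    if low == "":
        return high, high, True
    return low, high, True


def jam_offset_seconds(offset, jam_duration):
    """Convert a "+s" or "-s" offset to seconds after the start of the jam."""
    seconds = int(offset[1:])
    if offset[0] == "+":
        return seconds
    if offset[0] == "-":
        return jam_duration - seconds
    raise ValueError(f"not a jam-relative offset: {offset!r}")


JAM_DURATION = 120  # the 2:00 jam from the example

sat_at = jam_offset_seconds("-15", JAM_DURATION)
low, high, estimated = parse_estimate("+0~-15")
window = (jam_offset_seconds(low, JAM_DURATION),
          jam_offset_seconds(high, JAM_DURATION))
guess = sat_at - 10  # assume ten seconds from the penalty call to sitting

print(sat_at, window, guess)
```

Under these assumptions, “+0~-15” and “+0~+105” describe the same window, and “~+95” is the point estimate ten seconds before the skater sat.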

Estimated times like these are designed to always be derived by some sort of algorithm. An example algorithm would take the bout clock paperwork and combine it with the penalty box paperwork to get times as described above. Note that all algorithms should produce the same time range estimate, but specific time estimates may vary. As a result, any application is free to ignore the estimated timestamps and provide its own.