Synthetic Biology Open Language (SBOL) Version 2.3

Abstract

Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. One method to ameliorate these problems is to improve the exchange of information about designed systems between laboratories. The synthetic biology open language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. This document details version 2.3.0 of SBOL, which builds upon version 2.2.0 published in last year’s JIB Standards in Systems Biology special issue. In particular, SBOL 2.3.0 includes means of succinctly representing sequence modifications, such as insertion, deletion, and replacement, an extension to support organization and attachment of experimental data derived from designs, and an extension for describing numerical parameters of design elements. The new version also includes specifying types of synthetic biology activities, unambiguous locations for sequences with multiple encodings, refinement of a number of validation rules, improved figures and examples, and clarification on a number of issues related to the use of external ontology terms.


Purpose
Synthetic biology builds upon the techniques and successes of genetics, molecular biology, and metabolic engineering by applying engineering principles to the design of biological systems. These principles include standardization, modularity, and design abstraction. The field still faces substantial challenges, including long development times, high rates of failure, and poor reproducibility. A common factor of these challenges is the exchange of information about designed systems between laboratories. When designing a synthetic system, synthetic biologists need to exchange information about multiple types of molecules and their expected behavior in the design. Furthermore, there are often multiple degrees of separation between a specified nucleic acid sequence (e.g., a sequence that encodes an enzyme or transcription factor) and the molecular interactions that a designer intends to result from said sequence (e.g., chemical modification of metabolites or regulation of gene expression), yet these different perspectives need to be connected together in the engineering of biological systems.

The Synthetic Biology Open Language (SBOL) has been developed as a standard to support the specification and exchange of biological design information in synthetic biology, filling a need not satisfied by other pre-existing standards. Previous nucleic acid sequence description formats lack key capabilities. For example, simple sequence encoding formats such as FASTA encode almost nothing about design rationale. More sophisticated formats such as GenBank and Swiss-Prot support a flat annotation of sequence features that is well suited to the description of natural systems, but is unable to represent the multi-layered design structure common to engineered systems.
Figure 1 shows the relationship of selected prior sequence description formats to SBOL 1.x and SBOL 2.x. Modeling languages, such as the Systems Biology Markup Language (SBML) Hucka et al. (2003), can be used to represent biological processes, but are not sufficient to represent the associated nucleotide or amino acid sequences. Synthetic biology needs a structured standard that defines how to represent relevant molecules and their functional roles within a designed system, standardized rules on how such information is encoded in a file format, and software libraries to enable the exchange of such data between participating laboratories and as part of the publication process.

To help address these challenges, SBOL introduces a standardized format for the electronic exchange of information

The Computational Modeling in Biology Network (COMBINE) holds regular workshops where synthetic biologists and systems biologists can work toward a common goal of integrating biological knowledge through inter-operable and non-overlapping data standards. In April of 2014, several SBOL Developers attended a COMBINE workshop and then proposed that SBOL join this larger standards community. The proposal passed and SBOL workshops have been co-located with COMBINE meetings since the 11th workshop at the University of Southern California in August 2014.

Current development of this SBOL 2.x specification is funded in large part by a grant from the National Science

This document indicates requirement levels using the controlled vocabulary specified in IETF RFC 2119 and reiterated in BBF RFC 0. In particular, the key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

■ The words "MUST", "REQUIRED", or "SHALL" mean that the item is an absolute requirement.
■ The phrases "MUST NOT" or "SHALL NOT" mean that the item is an absolute prohibition.
■ The word "SHOULD" or the adjective "RECOMMENDED" mean that there might exist valid reasons in particular circumstances to ignore a particular item, but the full implications need to be understood and carefully weighed before choosing a different course.
■ The phrases "SHOULD NOT" or "NOT RECOMMENDED" mean that there might exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications need to be understood and the case carefully weighed before implementing any behavior described with this label.
■ The word "MAY" or the adjective "OPTIONAL" mean that an item is truly optional.

SBOL defines the following "top-level" and dependent classes:

■ Collection: Represents a user-defined container for organizing a group of SBOL objects.
■ ComponentDefinition: Describes the structure of designed entities, such as DNA, RNA, and proteins, as well as other entities they interact with, such as small molecules or environmental properties.
■ GenericTopLevel: Represents a data container that can contain custom data added by user applications.
■ Model: Links to quantitative or qualitative computational models that might be used to predict the functional behavior of a biological design.

Section 5.2 SBOL Class Names
the design of a system. Because the same definition might appear in multiple designs or multiple times in a single design, a single ComponentDefinition can have zero or more parent ModuleDefinitions, and each such parent-child link requires its own, distinct FunctionalComponent.
■ Interaction: Describes a functional relationship between biological entities, such as regulatory activation or repression, or a biological process such as transcription or translation.
■ MapsTo: When a design (ComponentDefinition or ModuleDefinition) includes another design as a sub-design, the parent design might need to refer to a ComponentInstance (either a Component or FunctionalComponent) in the sub-design. In this case, a MapsTo needs to be added to the instance for the sub-design, and this MapsTo needs to link between the ComponentInstance in the sub-design and a ComponentInstance in the parent design.

6 Overview of SBOL

Synthetic biology designs can be described using:
■ Structural terms, e.g., a set of annotated sequences or information about the chemical makeup of components.
■ Functional terms, e.g., the way that components might interact with each other and the overall behavior of a design.

In broad strokes, the prior SBOL 1.1 standard focused on conveying physical, structural information, whereas SBOL 2 expands the scope to include functional aspects as well. The physical information about a designed genetic construct includes the order of its constituents and their descriptions. Specifying the exact locations of these constituents and their sequences allows genetic constructs to be defined unambiguously and reused in other designs. SBOL 2 extends SBOL 1.1 in several ways: it extends physical descriptions to include entities beyond DNA sequences, and it supports functional descriptions of designs.

As an example, consider the design of an expression cassette, such as the one found in the plasmid pUC18 Norrander et al. (1983). This device is designed to detect successful versus unsuccessful molecular cloning. As an overall system, the device is designed to grow either blue-colored (unsuccessful) or white-colored (successful) colonies in the presence of IPTG and the chemical X-gal. Internally, the device has a number of parts, including a promoter, the lac repressor binding site, and the lacZ coding sequence. These parts have specific component-level interactions with IPTG and X-gal, as well as native host gene products, transcriptional machinery and translational machinery that collectively cause the desired system-level behavior.

Knowledge of how such a device functions within the context of a host and how it might be adapted to new experimental applications has generally been passed on through working with fellow scientists or reading articles in papers and books. But there has been no systematic way to communicate the integration of sequences with functional designs, so users typically have had to look in many different places to develop an understanding of a system. The SBOL 2 standard allows designers to describe these functional characteristics and connect them to the physical parts and sequences that make up the design.

SBOL 2 includes two main classes that match the structural/functional distinction above:
■ The ComponentDefinition object describes the physical aspects of the designed system, such as its DNA or RNA sequences, and the physical relationships among sub-components, as when one sequence contains another as a sub-sequence.
■ The ModuleDefinition object describes interactions of the designed system, such as specific binding relationships and repression and activation relationships.

Whereas Figure 2 provides a broad overview of SBOL, Figure 3 provides a detailed, implementation-level overview of the class structure for the SBOL 2.x data model. This figure relies on the semantics of the Unified Modeling Language (UML), which will be presented in more detail in the next section. Figure 3 distinguishes between top level classes, in green, and other supporting classes (note that Figure 2 also includes all of the top level classes). In Figure 3, dashed arcs represent "refersTo", whereas a solid arrow represents ownership. In UML, the meaning of ownership is that if a parent class is deleted, so are all of its owned children. Thus, a Collection does not own its ComponentDefinition objects, because these can stand on their own. All of the supporting classes (in orange) have to be owned by some top-level class, directly or indirectly.

ways incorporate that object by reference. We do not directly incorporate it by copy, because when an object is used many times, keeping many copies becomes spatially inefficient and difficult to maintain. Instead, each

x provides a few helper classes. Location generalizes the positioning information from SBOL 1.1 to allow discontinuous ranges and cuts to be annotated. SequenceConstraint generalizes the relative positioning information among Components. There are also Participations, which allow Interaction objects to specify the roles of their participants while referencing the FunctionalComponents, so that these can stand on their own. Additionally, there is the MapsTo class (not shown), which enables connections to be made between Components and FunctionalComponents across various levels of the design hierarchy. The next section provides complete definitions and details for all of these classes.
There is one final, critical element of SBOL 2: its extension mechanism. This extension mechanism enables the storage of application-specific information within an SBOL document. It is also intended to support the prototyping of data representations whose format is not yet a matter of consensus within the community. In particular, each SBOL entity can be annotated using the Resource Description Framework (RDF). Moreover, application-specific entities in the form of RDF documents can be included as GenericTopLevel entities. SBOL libraries make these annotations and entities available to tools as generic properties and objects that are preserved during subsequent read and write operations.

SBOL Data Model

In this section, we describe the types of biological design data that can belong to an SBOL document and the relationships between these data types. The SBOL data model is specified using Unified Modeling Language (UML) 2.0 diagrams (OMG 2005). Sections 7.1, 7.2, and 7.3 review the basics of UML diagrams and explain the naming conventions and generic data types used in this specification. The remaining sections then describe the SBOL data model in detail. Complete SBOL examples and best practices when using the standard can be found in Section 9 and Section 12, respectively.

7.1 Understanding the UML Diagrams

The types of biological design data modeled by SBOL are commonly referred to as classes, especially when discussing the details of software implementation. Each SBOL class can be instantiated by many SBOL objects. These objects MAY contain data that differ in content, but they MUST agree on the type and form of their data as dictated by their common class. Classes are represented in UML diagrams as rectangles labeled at the top with class names.

Classes can be connected to other classes by association properties, which are represented in UML diagrams as arrows. These arrows are labeled with data cardinalities in order to indicate how many values a given association property can possess (see below). The remaining (non-association) properties of a class are listed below its name. Each of the latter properties is labeled with its data type and cardinality.

In the case of an association property, the class from which the arrow originates is the owner of the association property. A diamond at the origin of the arrow indicates the type of association. Open-faced diamonds indicate shared aggregation, in which the owner of the association property exists independently of its value. In the SBOL data model, the value of an association property MUST be a URI or set of URIs that refer to SBOL objects belonging to the class at the tip of the arrow.

By contrast, filled diamonds indicate composite aggregation, also known as a part-whole relationship, in which the value of the association property MUST NOT exist independently of its owner. In addition, in the SBOL data model, it is REQUIRED that the value of each composite aggregation property is a unique SBOL object (that is, not the value for more than one such property). Note that in all cases, composite aggregation is used in such a way that there SHOULD NOT be duplication of such objects.

All SBOL properties are labeled with one of several restrictions on data cardinality. These are:
■ 1 - REQUIRED, one: there MUST be exactly one value for this property.

Section 7.3 Data Types
after the first begin with an uppercase letter (e.g., persistentIdentity).

Within the SBOL data model, each property is given a singular or plural name in accordance with its data cardinalities. The forms of these names follow the usual rules of English grammar. For example, sequenceAnnotation is the singular form of sequenceAnnotations.

When SBOL objects are serialized (using the Resource Description Framework (RDF), as described in Section 10), however, SBOL properties are always given singular names. This is because the SBOL data model does not contain classes that correspond directly to the RDF elements that group other elements into ordered or unordered sets. Consequently, if an SBOL property has multiple values, then it is serialized as multiple property entries, each with a singular name and a single value. For example, if an SBOL property has five values, then its serialization contains five RDF triples, each with a singular predicate name and one of the five values as its object.

When SBOL uses simple "primitive" data types such as Strings or Integers, these are defined as the following specific formal types:

The term literal is used to denote an object that can be any of the four types listed above. In addition to the simple types listed above, SBOL also uses objects with types Uniform Resource Identifier (URI) and XML Qualified

It is important to realize that in RDF, a URI might or might not be a resolvable URL (web address). A URI is always a globally unique identifier within a structured namespace. In some cases, that name is also a reference to (or within) a document, and in some cases that document can also be retrieved (e.g., using a web browser).
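The serialization rule for multi-valued properties described in this section can be illustrated with a small sketch. It is not taken from any SBOL library: the function name serialize_property and the example.org URIs are hypothetical, and triples are modeled as plain Python tuples rather than with an RDF toolkit. The sketch only shows that a plural property with five values yields five triples that share one singular predicate.

```python
from typing import Iterable, List, Tuple

# A triple is modeled as a plain (subject, predicate, object) tuple of URIs.
Triple = Tuple[str, str, str]

SBOL_NS = "http://sbols.org/v2#"


def serialize_property(subject: str, singular_name: str,
                       values: Iterable[str]) -> List[Triple]:
    """Emit one triple per value, all sharing the singular predicate name.

    Hypothetical helper for illustration; not part of any SBOL library.
    """
    predicate = SBOL_NS + singular_name
    return [(subject, predicate, value) for value in values]


# Hypothetical object with five sequenceAnnotations (plural in the data model).
gene = "http://example.org/my_gene/1"
annotations = [f"http://example.org/my_gene/anno{i}/1" for i in range(1, 6)]

# Serialized under the singular name "sequenceAnnotation".
triples = serialize_property(gene, "sequenceAnnotation", annotations)
```

No grouping element appears anywhere in the result, which mirrors the statement that the data model has no classes corresponding to RDF collection elements.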

Section 7.4 Identified
As shown in Figure 4, the Identified class includes the following properties: identity, persistentIdentity, version, wasDerivedFroms, name, description, and annotations. The latter property is described separately in Section 7.16.

When an SBOL resource reference takes the form of a URI, that URI can either be the value of an identity property or the value of a persistentIdentity property. If the URI is equal to the value of an identity property, then it is guaranteed to be unique, and it refers to precisely one SBOL object with that URI. If the URI is equal to the value of a persistentIdentity property, then it MAY refer to multiple SBOL objects that are different "versions" of each other. These objects SHOULD be compared to one another to determine which single object the URI resolves to (normally the most recent version; see Section 7.4). Throughout this document, when a URI is used to refer to an SBOL object, it could fall into either of these cases.

The identity property

The identity property is REQUIRED by all Identified objects and has a data type of URI. A given Identified object's identity URI MUST be globally unique among all other identity URIs. It is also highly RECOMMENDED that the URI structure follow the recommended best practices for compliant URIs specified in Section 12.3.

Although most SBOL properties are defined by SBOL and serialized with its namespace, the identity property is defined by the analogous RDF about property and is serialized with the RDF namespace as follows: http://www.w3.org/1999/02/22-rdf-syntax-ns#about.

The use of about is expressly for the purpose of making SBOL compliant with pre-existing standards: when you see about in an SBOL document, you SHOULD interpret it as meaning identity.

The persistentIdentity property

The persistentIdentity property is OPTIONAL and has a data type of URI. This URI serves to uniquely refer to a set of SBOL objects of the same class that are different versions of each other.

An Identified object MUST be referred to using either its identity URI or its persistentIdentity URI.

The displayId property

The displayId property is an OPTIONAL identifier with a data type of String. This property is intended to be an intermediate between name and identity that is machine-readable, but more human-readable than the full URI of an identity.

The version property

The version property is OPTIONAL and has a data type of String. This property can be used to compare two SBOL objects with the same persistentIdentity.

If the version property is used, then it is RECOMMENDED that version numbering follow the conventions of semantic versioning (http://semver.org/), particularly as implemented by Maven (http://maven.apache.org/). This convention represents versions as sequences of numbers and qualifiers that are separated by the characters "." and "-" and are compared in lexicographical order (for example, 1 < 1.3.1 < 2.0-beta). For a full explanation, see the linked resources.

The wasDerivedFroms property

The wasDerivedFroms property is OPTIONAL and MAY specify a set of URIs. An SBOL object with this property refers to one or more SBOL objects or non-SBOL resources from which this object was derived.

The wasDerivedFroms property of a TopLevel SBOL object is subject to the following rules. If any member of the wasDerivedFroms property of an SBOL object A refers to an SBOL object B with an identical persistentIdentity, and both A and B have a version, then the version of B MUST precede that of A. In addition, an SBOL object MUST NOT refer to itself via its own wasDerivedFroms property or form a cyclical chain of references via its wasDerivedFroms property and those of other SBOL objects. For example, the reference chain "A was derived from B and B was derived from A" is cyclical.
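The prohibition on self-reference and cyclical wasDerivedFroms chains lends itself to a simple graph check. The following is a minimal sketch, assuming that the wasDerivedFroms values of all objects in a document have already been collected into a dictionary keyed by identity URI. The function name find_derivation_cycle and the dictionary layout are illustrative assumptions, not part of the specification or of any SBOL library.

```python
from typing import Dict, List, Set


def find_derivation_cycle(derived_from: Dict[str, Set[str]]) -> List[str]:
    """Return one cyclical chain of URIs, or an empty list if none exists.

    `derived_from` maps an object's identity URI to the URIs listed in its
    wasDerivedFroms property (hypothetical in-memory layout). URIs that are
    not keys (e.g., non-SBOL resources) are treated as having no sources.
    """
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}

    def visit(uri: str, path: List[str]) -> List[str]:
        state[uri] = IN_PROGRESS
        path.append(uri)
        for source in sorted(derived_from.get(uri, ())):
            if state.get(source) == IN_PROGRESS:
                # Reached a URI already on the current chain: a cycle.
                return path[path.index(source):] + [source]
            if source not in state:
                cycle = visit(source, path)
                if cycle:
                    return cycle
        path.pop()
        state[uri] = DONE
        return []

    for uri in derived_from:
        if uri not in state:
            cycle = visit(uri, [])
            if cycle:
                return cycle
    return []


# The cyclical example from the text: A derived from B, B derived from A.
cyclic = find_derivation_cycle({"A": {"B"}, "B": {"A"}})
# A chain without cycles: A derived from B, B derived from C.
acyclic = find_derivation_cycle({"A": {"B"}, "B": {"C"}})
```

The sketch uses recursion for brevity; very long derivation chains would call for an explicit stack instead.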

The name property

The name property is OPTIONAL and has a data type of String. This property is intended to be displayed to a human when visualizing an Identified object.

If an Identified object lacks a name, then software tools SHOULD instead display the object's displayId or identity. It is RECOMMENDED that software tools give users the ability to switch perspectives between name properties that are human-readable and displayId properties that are less human-readable, but are more likely to be unique.
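The display fallback described above can be sketched as a small helper. The function display_label and its parameter names are hypothetical, and the preference for displayId over identity when both are available is an assumption of this sketch; the text only says that one of the two SHOULD be shown when name is absent.

```python
from typing import Optional


def display_label(identity: str,
                  display_id: Optional[str] = None,
                  name: Optional[str] = None) -> str:
    """Pick the text a tool might show for an Identified object.

    Order used here (an assumption of this sketch): name, then displayId,
    then the full identity URI, which is always present.
    """
    if name:
        return name
    if display_id:
        return display_id
    return identity


# Hypothetical object values for illustration.
uri = "http://example.org/pLac/1"
with_name = display_label(uri, display_id="pLac", name="lac promoter")
without_name = display_label(uri, display_id="pLac")
bare = display_label(uri)
```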

The description property

The description property is OPTIONAL and has a data type of String. This property is intended to contain a more thorough text description of an Identified object.

The annotations property

The annotations property is OPTIONAL and MAY specify a set of Annotation objects that are contained by the Identified object. The Annotation class is described in more detail in Section 7.16.1.

No complete serialization is defined for Identified, since this class is only used indirectly through its child classes. Any such child class, however, has the following form for serializing properties inherited from Identified, where CLASS_NAME is replaced by the name of the class:

Collection, GenericTopLevel, CombinatorialDerivation, and Implementation (Figure 5).

The attachments property

The attachments property is OPTIONAL and MAY specify a set of Attachment objects that are referenced by the TopLevel object. The Attachment class is described in more detail in Section 7.13.

No serialization is defined for TopLevel, since this class has no properties of its own and is only used indirectly through its child classes. All TopLevel classes are serialized one level beneath the RDF document root.

The purpose of the Sequence class is to represent the primary structure of a ComponentDefinition object and the manner in which it is encoded. This representation is accomplished by means of the elements property and encoding property (Figure 6).

The elements property

The elements property is a REQUIRED String of characters that represents the constituents of a biological or chemical molecule. For example, these characters could represent the nucleotide bases of a molecule of DNA, the amino acid residues of a protein, or the atoms and chemical bonds of a small molecule.

The encoding property

The encoding property is REQUIRED and has a data type of URI. This property MUST indicate how the elements property of a Sequence MUST be formed and interpreted.

For example, the elements property of a Sequence with an IUPAC DNA encoding property MUST contain characters that represent nucleotide bases, such as a, t, c, and g. The elements property of a Sequence with a Simplified Molecular-Input Line-Entry System (SMILES) encoding, on the other hand, MUST contain characters that represent atoms and chemical bonds, such as C, N, O, and =.
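The relationship between encoding and elements can be sketched as a character-set check. Everything specific in this sketch is an assumption: the IUPAC DNA encoding URI is the value commonly used for SBOL 2 and should be verified against Table 1, the allowed alphabet is the lowercase IUPAC nucleotide code set without gap characters, and the function name invalid_elements is hypothetical. Other encodings such as SMILES would need a real parser and are deliberately not covered.

```python
from typing import Dict, Set

# Assumed URI for the IUPAC DNA encoding; verify against Table 1.
IUPAC_DNA = "http://www.chem.qmul.ac.uk/iubmb/misc/naseq.html"

# Assumed alphabet: IUPAC nucleotide codes for DNA, lowercase, no gaps.
ALLOWED_CHARACTERS: Dict[str, Set[str]] = {
    IUPAC_DNA: set("acgtrykmswbdhvn"),
}


def invalid_elements(elements: str, encoding: str) -> Set[str]:
    """Return the characters of `elements` not permitted by `encoding`.

    Raises ValueError for encodings this sketch does not know about.
    """
    allowed = ALLOWED_CHARACTERS.get(encoding)
    if allowed is None:
        raise ValueError(f"no character set known for encoding {encoding!r}")
    return set(elements.lower()) - allowed


# A DNA string passes; a SMILES-like string under the DNA encoding does not.
dna_problems = invalid_elements("gattaca", IUPAC_DNA)
smiles_problems = invalid_elements("C=O", IUPAC_DNA)
```

An empty result means that every character is consistent with the declared encoding under the assumptions stated above.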
Table 1 provides a list of possible URI values for the encoding property. The terms in Table 1 are organized by the type of ComponentDefinition (see Table 2) that typically refers to a Sequence with such an encoding.

The example below shows the serialization of the Sequence for a promoter. The nucleotide bases of the Sequence are serialized as the String value of its elements property, while its IUPAC DNA encoding is serialized as the URI value of its encoding property.

The ComponentDefinition class represents the structural entities of a biological design. The primary usage of this class is to represent structural entities with designed sequences, such as DNA, RNA, and proteins, but it can also be used to represent any other entity that is part of a design, such as small molecules, molecular complexes, and light.

As shown in Figure 7, the ComponentDefinition class describes a structural design entity using the following properties: types, roles, and sequences. In addition, this class has properties for describing and organizing the substructure of said design entity, including components, sequenceAnnotations, and sequenceConstraints.

types property of a ComponentDefinition SHOULD contain a URI from Table 2, and any ComponentDefinition that can be well-described by one of the terms in Table 2 MUST use the URI for that term as one of its types. Finally, if the types property contains multiple URIs, then they MUST identify non-conflicting terms (otherwise, it might not be clear how to interpret them). For example, the BioPAX terms provided by Table 2

Any ComponentDefinition classified as DNA (see Table 2) is RECOMMENDED to encode circular/linear topology information in an additional type field. This (topology) type field SHOULD specify a URI from the Topology Attribute branch of the SO (this is currently just 'linear' or 'circular' as given in Table 3). Topology information SHOULD be specified for DNA ComponentDefinition records with a fully specified sequence, except in three scenarios: if the DNA record doesn't have sequence information, or if the DNA record has incomplete sequence information, or if topology is genuinely unknown. For any ComponentDefinition classified as RNA (see Table 2), a topology type field is OPTIONAL. The default assumption in this case is linear topology. In any case, no more than one topology should be specified.

Any ComponentDefinition classified as DNA or RNA MAY also have strand information encoded in an additional (third) type field using a URI from the Strand Attribute branch of the SO (currently there are only two possible terms for single or double-stranded nucleic acids given in Table 3

The roles property

The roles property is an OPTIONAL set of URIs that clarifies the potential function of the entity represented by a ComponentDefinition in a biochemical or physical context.

The roles property of a ComponentDefinition MAY contain one or more URIs that MUST identify terms from ontologies that are consistent with the types property of the ComponentDefinition.

(GO:0003674) of the Gene Ontology (GO) and the role branch (CHEBI:50906) of the CHEBI ontology. Table 4 contains a list of possible ontology terms for the roles property and their URIs. These terms are organized by the type of ComponentDefinition to which they SHOULD apply (see Table 2). Any ComponentDefinition that can be well-described by one of the terms in Table 4 MUST use the URI for that term as one of its roles.

Table 4: Ontology terms to specify the roles property of a ComponentDefinition, organized by the type of ComponentDefinition to which they are intended to apply (see Table 2).

The sequences property

The sequences property is OPTIONAL and MAY include a set of URIs that refer to Sequence objects. These objects define the primary structure of the ComponentDefinition.

Many ComponentDefinition objects will refer to precisely one Sequence object. For certain use cases, however, it can be appropriate to refer to multiple Sequence objects. For example, a user might wish to provide two different representations of the structure of a DNA ComponentDefinition, one that represents its structure at the level of nucleotide bases and one that represents its structure at the level of atoms and bonds.

If a ComponentDefinition refers to more than one Sequence object, then these objects MUST

because it specifies how to encode biochemical entities in general, which includes DNA, RNA, and proteins. If a ComponentDefinition refers to more than one Sequence with the same encoding, then the elements of these Sequence objects SHOULD have equal lengths. These requirements and best practices are intended to make it easier for software tools to locate any regions specified by the SequenceAnnotation objects of a ComponentDefinition on its associated Sequence objects, as well as validate whether its Sequence objects are consistent with those associated with any ComponentDefinition objects that it composes via its Component objects.

Finally, if a ComponentDefinition refers to one or more Sequence objects and its types property refers to a term from

are meant to provide for some degree of consistency between the types property of a ComponentDefinition and the encoding properties of the Sequence objects to which the ComponentDefinition refers.

The components property

The components property is OPTIONAL and MAY specify a set of Component objects that are contained by the ComponentDefinition. The set of relations between Component and ComponentDefinition objects is strictly acyclic (see Section 7.7.1).

While the ComponentDefinition class is analogous to a blueprint or specification sheet for a biological part, the Component class represents the specific occurrence of a part within a design. Hence, this class allows a biological design to include multiple instances of a particular part (defined by reference to the same ComponentDefinition). For example, the ComponentDefinition of a polycistronic gene could contain two Component objects that refer to the same ComponentDefinition of a CDS.
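The blueprint-versus-occurrence distinction in the polycistronic gene example can be sketched with two small Python classes. These are illustrative stand-ins rather than the classes of any SBOL library: only the fields needed for the example are modeled, the example.org URIs are hypothetical, and the link from a Component to its definition is held as a direct object reference here, whereas SBOL stores it as a URI.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComponentDefinition:
    """Blueprint for a part (simplified stand-in, not a library class)."""
    identity: str
    components: List["Component"] = field(default_factory=list)


@dataclass
class Component:
    """One occurrence of a definition inside a parent definition."""
    identity: str
    definition: ComponentDefinition


# One CDS blueprint, defined once.
cds = ComponentDefinition("http://example.org/cds/1")

# A polycistronic gene that uses the same CDS blueprint twice.
gene = ComponentDefinition("http://example.org/gene/1")
gene.components.append(Component("http://example.org/gene/cds_a/1", cds))
gene.components.append(Component("http://example.org/gene/cds_b/1", cds))

# Two distinct occurrences, one shared definition.
shared = {id(c.definition) for c in gene.components}
```

Because each Component has its own identity, the two occurrences can later be positioned or constrained independently even though they share one definition.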

26
The components properties of ComponentDefinition objects can be used to construct a hierarchy of Component 27 and ComponentDefinition objects. If a ComponentDefinition in such a hierarchy refers to one or more Sequence 28 objects, and there exist ComponentDefinition objects lower in the hierarchy that refer to Sequence objects with 29 the same encoding, then the elements properties of these Sequence objects SHOULD be consistent with each 30 other, such that well-defined mappings exist from the "lower level" elements to the "higher level" elements in 31 accordance with their shared encoding properties. This mapping is also subject to any restrictions on the positions 32 of the Component objects in the hierarchy that are imposed by the SequenceAnnotation or SequenceConstraint 33 objects contained by the ComponentDefinition objects in the hierarchy.

A DNA ComponentDefinition, for example, could refer to a Sequence with an IUPAC DNA encoding and an elements String of "gattaca." In turn, this ComponentDefinition could contain a Component that refers to a "lower level" ComponentDefinition that also refers to a Sequence with an IUPAC DNA encoding. Consequently, a consistent elements String of this "lower level" Sequence could be "gatta," or perhaps "tgta" if the Component is positioned by a SequenceAnnotation that contains a Location with an orientation of "reverse complement" (see Section 7.7.5).

Sequence with an IUPAC DNA encoding. In order to specify the discontiguous region occupied by its CDS, this gene ComponentDefinition would need a SequenceAnnotation that contains one or more Range objects, each one specifying start and end positions that correspond to indices of the elements of its DNA Sequence.
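The "gattaca" consistency example above can be checked mechanically. The following is a minimal sketch, assuming lowercase IUPAC DNA strings restricted to a, c, g, and t; the helper names reverse_complement and is_consistent are hypothetical and are not part of SBOL or of any SBOL library.

```python
# Hypothetical helpers; only the example strings come from the specification text.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(elements: str) -> str:
    """Return the reverse complement of a lowercase DNA string (a, c, g, t only)."""
    return elements.translate(COMPLEMENT)[::-1]


def is_consistent(parent: str, child: str, reverse: bool = False) -> bool:
    """Return True if child occurs in parent, optionally after reverse complementing it.

    This is a simplified reading of "consistent": the lower-level elements
    must map onto a contiguous stretch of the higher-level elements.
    """
    candidate = reverse_complement(child) if reverse else child
    return candidate in parent


# "gatta" maps directly onto "gattaca"; "tgta" maps only as a reverse complement.
inline_ok = is_consistent("gattaca", "gatta")
reverse_ok = is_consistent("gattaca", "tgta", reverse=True)
```

A real validator would also need to honor the positions imposed by SequenceAnnotation and SequenceConstraint objects, which this sketch ignores.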

The sequenceConstraints property
The sequenceConstraints property is OPTIONAL and MAY contain a set of SequenceConstraint objects. These objects describe any restrictions on the relative, sequence-based positions and/or orientations of the Component objects contained by the ComponentDefinition. For example, the ComponentDefinition of a gene might specify that the position of its promoter Component precedes that of its CDS Component. This is particularly useful when a ComponentDefinition lacks a Sequence and therefore cannot specify the precise, sequence-based positions of its Component objects using SequenceAnnotation objects.

The serialization of a ComponentDefinition MUST have the form below. The components, sequenceConstraints, sequenceAnnotations, and sequences properties of a ComponentDefinition contain or reference objects belonging to the appropriate SBOL classes as their values, while the types and roles properties contain URIs that identify ontology terms as their values.

As shown below, each of these objects and URIs is serialized as part of an implicit set of SBOL properties with singular rather than plural names. In particular, each object is serialized as an RDF/XML node nested within a property, while each URI (except the identity) is serialized as an rdf:resource on a property.

The ComponentInstance abstract class is inherited by SBOL classes that represent the usage or occurrence of a ComponentDefinition within a larger design (that is, another ComponentDefinition or ModuleDefinition).

Currently, there are two subclasses of ComponentInstance:
■ The Component class is used to specify the structural usage of a ComponentDefinition inside another ComponentDefinition via the components property.
■ The FunctionalComponent class is used to specify the functional usage of a ComponentDefinition inside a ModuleDefinition via the functionalComponents property. This class is described in Section 7.9.2.

The definition property
The definition property is a REQUIRED URI that refers to the ComponentDefinition of the ComponentInstance.

ModuleDefinition (one that does not contain this ComponentInstance).

Table 5 provides a list of REQUIRED access URIs. The value of the access property MUST be one of these URIs.

The expected purpose and function of a genetic part are described by the roles property of ComponentDefinition.

However, the same building block might be used for a different purpose in an actual design. In other words, purpose and function are sometimes determined by context.

The roles property comprises an OPTIONAL set of zero or more role URIs describing the purpose or potential func-

Section 7.7 ComponentDefinition
A roleIntegration specifies the relationship between a Component instance's own set of roles and the set of roles on the included sub-ComponentDefinition.

The roleIntegration property has a data type of URI. A Component instance with zero roles MAY OPTIONALLY specify a roleIntegration. A Component instance with one or more roles MUST specify a roleIntegration from

http://sbols.org/v2#mergeRoles: Use the union of the two sets: both the set of zero or more roles given for this Component as well as the set of zero or more roles given for the included sub-ComponentDefinition.

... property allows for only a portion of a ComponentDefinition's Sequence to be included, rather than its entirety. If the sourceLocations property is not set, then the whole Sequence is assumed to be included. Alternatively, if the sourceLocations property is set, then the relationship between the original ComponentDefinition's Sequence and the included Sequence is defined identically to the locations property on the SequenceAnnotation object.

The example below shows the serialization of a Component that represents an instance of a promoter:

In particular, a MapsTo object provides two pieces of information:
■ An identity relationship between two ComponentInstance objects, the first contained by the "lower level" definition of the ComponentInstance or Module that owns the MapsTo, and the second contained by the "higher level" definition that contains the ComponentInstance or Module that owns the MapsTo. The remote property of a MapsTo refers to the first "lower level" ComponentInstance, while the local property refers to the second "higher level" ComponentInstance.

■ Instructions on how to interpret local and remote ComponentInstance objects that refer to different ComponentDefinition objects (that is, non-identical objects). These are specified using the refinement property of the MapsTo class.

To illustrate this concept, two examples are provided in Figure 10, in which the ComponentDefinition of a transcriptional unit is specified by composing two "lower level" ComponentDefinition objects. In both examples, the two "lower level" ComponentDefinition objects each contain a RBS Component that is intended to represent the same design entity in the "higher level" ComponentDefinition of the transcriptional unit.

In order to explicitly represent the identity relationships in this example, a new RBS Component needs to be created inside the "higher level" ComponentDefinition. This "higher level" Component then needs to be linked to the equivalent "lower level" Component objects by means of the MapsTo class, using one MapsTo object per link. For example, in order to link the "higher level" RBS Component to the "lower level" RBS Component of the promoter-RBS ComponentDefinition, a MapsTo has to be created on the "higher level" promoter-RBS Component. The local property of this MapsTo then has to refer to the "higher level" RBS Component, while its remote property has to refer to the "lower level" RBS Component. In this way, many "lower level" Component objects can be linked together at the "higher level" using an equal number of MapsTo objects, each one referring to a different remote Component, but all referring to the same local Component.

The same types of identity relationships can also be declared between FunctionalComponent objects contained by ModuleDefinition objects, or between Component objects and FunctionalComponent objects contained by ComponentDefinition objects and ModuleDefinition objects, respectively. See Section 9 and Section B for

Figure 10: In the left-hand diagram, the two Component objects inside the promoter-RBS ComponentDefinition and RBS-CDS ComponentDefinition objects both refer to an abstract RBS ComponentDefinition that lacks a sequence (white semicircle). Through the use of MapsTo objects with refinement set to useLocal, these "lower level" ComponentDefinition objects are effectively overridden by that of the green RBS in the ComponentDefinition of the complete transcriptional unit. In the right-hand diagram, however, the two "lower level" RBS ComponentDefinition objects do not lack sequences and it is the "higher level" RBS ComponentDefinition that is abstract. In this case, one of the MapsTo objects has a useRemote refinement, resulting in the green RBS ComponentDefinition overriding that of the abstract RBS in the "higher level" ComponentDefinition.

The local property
This REQUIRED property has a data type of URI and is used to refer to the ComponentInstance contained by the "higher level" ComponentDefinition or ModuleDefinition.

The remote property
This REQUIRED property has a data type of URI and is used to refer to the ComponentInstance contained by the "lower level" ComponentDefinition or ModuleDefinition. This remote ComponentInstance MUST be contained by the ComponentDefinition or ModuleDefinition that is the definition of the ComponentInstance or Module that owns the MapsTo. Lastly, the access property of the remote ComponentInstance MUST be set to "public."
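The containment and access rules for the local and remote properties can be expressed as a small check. The sketch below is illustrative only: the ComponentInstance and MapsTo dataclasses, the plain-string access values, and the check_maps_to function are hypothetical simplifications, not the SBOL data model or any library API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ComponentInstance:
    uri: str
    definition: str         # URI of the definition this instance refers to
    access: str = "public"  # simplified stand-in for the access URI


@dataclass
class MapsTo:
    local: str       # URI of the "higher level" ComponentInstance
    remote: str      # URI of the "lower level" ComponentInstance
    refinement: str  # for example "useLocal" or "useRemote"


def check_maps_to(
    owner: ComponentInstance,
    maps_to: MapsTo,
    higher_level: Dict[str, ComponentInstance],
    definitions: Dict[str, Dict[str, ComponentInstance]],
) -> List[str]:
    """Return a list of rule violations for one MapsTo owned by `owner`.

    higher_level: instances contained by the definition that contains `owner`.
    definitions: maps a definition URI to the instances that definition contains.
    """
    problems = []
    if maps_to.local not in higher_level:
        problems.append("local is not contained by the higher-level definition")
    remote = definitions.get(owner.definition, {}).get(maps_to.remote)
    if remote is None:
        problems.append("remote is not contained by the owner's definition")
    elif remote.access != "public":
        problems.append("remote is not public")
    return problems


# Transcriptional unit containing a promoter-RBS instance and its own RBS instance.
low_rbs = ComponentInstance("pr/rbs", "rbs_def")
definitions = {"promoter_rbs_def": {"pr/rbs": low_rbs}}
owner = ComponentInstance("tu/promoter_rbs", "promoter_rbs_def")
high_rbs = ComponentInstance("tu/rbs", "rbs_def")
higher_level = {"tu/promoter_rbs": owner, "tu/rbs": high_rbs}
link = MapsTo(local="tu/rbs", remote="pr/rbs", refinement="useLocal")
violations = check_maps_to(owner, link, higher_level, definitions)
```

In this arrangement the violations list is empty; making the "lower level" RBS non-public would produce a single violation.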

The refinement property
The refinement property is REQUIRED and has a data type of URI. Each MapsTo object MUST specify the relationship between its local and remote ComponentInstance objects using one of the REQUIRED refinement URIs provided in

In the example below, a FunctionalComponent in a "higher level" ModuleDefinition of a genetic toggle switch is linked to a FunctionalComponent in a "lower level" LacI inverter ModuleDefinition. The full example can be found in Section 9.

The locations property
The locations property is a REQUIRED set of one or more Location objects that indicate which elements of a Sequence are described by the SequenceAnnotation.

Allowing multiple Location objects on a single SequenceAnnotation is intended to enable representation of discontinuous regions (for example, a Component encoded across a set of exons with interspersed introns). As such, the Location objects of a single SequenceAnnotation SHOULD NOT specify overlapping regions, since it is not clear what this would mean. There is no such concern with different SequenceAnnotation objects, however, which can freely overlap in Location (for example, specifying overlapping linkers for sequence assembly).

The component property is OPTIONAL and has a data type of URI. This URI MUST refer to a Component that is contained by the same parent ComponentDefinition that contains the SequenceAnnotation. In this way, the properties of the SequenceAnnotation, such as its description and locations, are associated with part of the substructure of its parent ComponentDefinition.

recommended ontology terms for roles is given in Table 4.

The example below shows the serialization of a SequenceAnnotation object. It specifies the region occupied by a Component named BBa_F2620.

The Location class is extended by the Range, Cut, and GenericLocation classes.

The orientation property
The orientation property is OPTIONAL and has a data type of URI. All subclasses of Location share this property, which can be used to indicate how the region specified by the SequenceAnnotation and any associated double-stranded Component is oriented on the elements of a Sequence from their parent ComponentDefinition. Table 8 provides a list of REQUIRED orientation URIs. If a Location object has an orientation, then it MUST come from Table 8.
Note that the index of the first location is 1, as is typical practice in biology, rather than 0, as is typical practice in computer science.

The start property
The start property specifies the inclusive starting position of the Range. This property is REQUIRED and MUST contain an Integer value greater than zero.

The end property specifies the inclusive ending position of the Range. This property is REQUIRED and MUST contain an Integer value greater than zero. In addition, this Integer value MUST be greater than or equal to that of the start property.
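Because Range positions are 1-based and inclusive at both ends, software written in a 0-based language has to convert them. The sketch below shows one way to do this in Python; the function name range_elements is hypothetical, and the upper-bound check against the length of the elements String is an assumption rather than a rule quoted above.

```python
def range_elements(elements: str, start: int, end: int) -> str:
    """Return the characters of `elements` covered by a 1-based, inclusive Range."""
    if start < 1:
        raise ValueError("start MUST be greater than zero")
    if end < start:
        raise ValueError("end MUST be greater than or equal to start")
    if end > len(elements):
        # Assumption: a Range should not run past the end of the Sequence.
        raise ValueError("end exceeds the length of the elements String")
    # Shift start down by one; Python's exclusive slice end already matches
    # the inclusive 1-based end position.
    return elements[start - 1:end]


prefix = range_elements("gattaca", 1, 5)
suffix = range_elements("gattaca", 4, 7)
```

Here prefix covers positions 1 through 5 and suffix covers positions 4 through 7 of "gattaca".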

The serialization of a Range MUST have the following form:

The Cut class has been introduced to enable the specification of a region between two discrete positions. This specification is accomplished using the at property, which specifies a discrete position that corresponds to the index of a character in the elements String of a Sequence (except in the case when at is equal to zero; see below).
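One way to picture a Cut is as the boundary that splits the elements String in two, including the special case of at equal to zero that is defined under the at property. This is a minimal sketch; split_at_cut is a hypothetical helper, and rejecting values of at beyond the end of the String is an assumption.

```python
from typing import Tuple


def split_at_cut(elements: str, at: int) -> Tuple[str, str]:
    """Split `elements` at the boundary between position `at` and position `at` + 1.

    Positions are 1-based, so at == 0 denotes the boundary immediately before
    the first character.
    """
    if at < 0:
        raise ValueError("at MUST be greater than or equal to zero")
    if at > len(elements):
        # Assumption: a Cut should not lie beyond the end of the Sequence.
        raise ValueError("at exceeds the length of the elements String")
    return elements[:at], elements[at:]


before_first = split_at_cut("gattaca", 0)
after_third = split_at_cut("gattaca", 3)
```

With at equal to zero the left part is empty, which matches the description of a region immediately before the first character.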

The at property
The at property is REQUIRED and MUST contain an Integer value greater than or equal to zero. The region specified by the Cut is between the position specified by this property and the position that immediately follows it. When the at property is equal to zero, the specified region is immediately before the first discrete position or character in the elements String of a Sequence.

The example below shows the serialization of a Cut object. It specifies a region in between positions 10 and 11, with an orientation of "inline."

<sbol:Cut rdf:about="http://partsregistry.org/cd/BBa_J23119/cutat10/cut">

The serialization of a GenericLocation MUST have the following form:

The example below shows the serialization of a GenericLocation object with an orientation of "reverse complement."

The subject property
The subject property is REQUIRED and MUST contain a URI that refers to a Component contained by the same parent ComponentDefinition that contains the SequenceConstraint.

The object property is REQUIRED and MUST contain a URI that refers to a Component contained by the same parent ComponentDefinition that contains the SequenceConstraint. This Component MUST NOT be the same Component that the SequenceConstraint refers to via its subject property.

The restriction property
The restriction property is REQUIRED and has a data type of URI. This property MUST indicate the type of structural restriction on the positions, orientations, or structural identities of the subject and object Component objects in relation to each other. The URI value of this property SHOULD come from the RECOMMENDED URIs in Table 9.

The position of the subject Component MUST precede that of the object Component. If each one is associated with a SequenceAnnotation, then the SequenceAnnotation associated with the subject Component MUST specify a region that starts before the region specified by the SequenceAnnotation associated with the object Component.
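The ordering rule just stated, together with the requirement that subject and object differ, lends itself to a simple check. The sketch below assumes that each annotated Component is described by a list of (start, end) Range pairs and that the start of a region is its smallest start position; the function names and data layout are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[int, int]  # (start, end), 1-based and inclusive


def region_start(ranges: List[Range]) -> int:
    """Smallest start position over all Range pairs of one annotation."""
    return min(start for start, _ in ranges)


def precedes_violations(
    constraints: List[Tuple[str, str]],
    annotations: Dict[str, List[Range]],
) -> List[Tuple[str, str, str]]:
    """Check (subject, object) pairs that carry a "precedes" restriction.

    Pairs where either Component lacks an annotation are not checked for
    position, since no sequence-based position is available for them.
    """
    violations = []
    for subject, obj in constraints:
        if subject == obj:
            violations.append((subject, obj, "subject and object must differ"))
            continue
        if subject in annotations and obj in annotations:
            if not region_start(annotations[subject]) < region_start(annotations[obj]):
                violations.append((subject, obj, "subject region does not start first"))
    return violations


annotations = {"promoter": [(1, 35)], "cds": [(50, 300), (400, 700)]}
ok = precedes_violations([("promoter", "cds")], annotations)
swapped = precedes_violations([("cds", "promoter")], annotations)
```

The first call reports no violations, while the swapped pair is flagged because the CDS region starts after the promoter region.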

The serialization of a SequenceConstraint MUST have the following form:

The example below shows the serialization of a SequenceConstraint belonging to the ComponentDefinition of a LacI-repressible promoter. This SequenceConstraint has a "precedes" restriction that indicates that the subject Component, which represents the core of the promoter, is positioned before the object Component, which represents the LacI operator of the promoter.

The meta-data provided by the Model class include the following properties: the source or location of the actual content of the model, the language in which the model is implemented, and the model's framework.

The source property
The source property is REQUIRED and MUST contain a URI reference to the source file for a model.

The language property
The language property is REQUIRED and MUST contain a URI that specifies the language in which the model is implemented. It is RECOMMENDED that this URI refer to a term from the EMBRACE Data and Methods (EDAM) ontology. Table 10 provides a list of terms from this ontology and their URIs. If the language property of a Model is well-described by one of these terms, then it MUST contain the URI for this term as its value.

The framework property
The framework property is REQUIRED and MUST contain a URI that specifies the framework in which the model is implemented. It is RECOMMENDED that this URI refer to a term from the modeling framework branch of the SBO when possible. A few suggested modeling frameworks and their corresponding URIs are shown in Table 11. If the framework property of a Model is well-described by one of these terms, then it MUST contain the URI for this term as its value.

The serialization of a Model MUST have the following form:

The example below shows the serialization of a Model object that refers to a quantitative model of a genetic toggle switch. The model is implemented in the SBML language and adheres to a continuous modeling framework. Lastly

The ModuleDefinition class represents a grouping of structural and functional entities in a biological design. The primary usage of this class is to assert the molecular interactions and abstract function of its child entities.

ModuleDefinition objects can be more abstract and represent entities of engineering design rather than biology, they can have designated "inputs" and "outputs" expressed by the direction properties on their FunctionalComponent objects.

The roles property
The roles property is an OPTIONAL set of URIs that clarifies the intended function of a ModuleDefinition. These URIs might identify descriptive biological roles, such as "metabolic pathway" and "signaling cascade," but they can also identify "logical" roles, such as "inverter" or "AND gate", or other abstract roles for describing the function of a design. Interpretation of the meaning of such roles currently depends on the software tools that read and write them.

The modules property
The modules property is OPTIONAL and MAY specify a set of Module objects contained by the ModuleDefinition. Note that the set of relations between Module and ModuleDefinition objects is strictly acyclic.

While the ModuleDefinition class is analogous to a specification sheet for a system of interacting biological elements, the Module class represents the occurrence of a particular subsystem within the system. Hence, this class allows a system design to include multiple instances of a subsystem, all defined by reference to the same ModuleDefinition.

The Interaction class provides an abstract, machine-readable representation of entity behavior within a ModuleDefinition (whereas a more detailed model of the system might not be suited to machine reasoning, depending on its implementation). Each Interaction contains Participation objects that indicate the roles of the FunctionalComponent objects involved in the Interaction.

The models property is OPTIONAL and MAY specify a set of URI references to Model objects. Model objects are placeholders that link ModuleDefinition objects to computational models of any format. A ModuleDefinition object can link to more than one Model since each might encode system behavior in a different way or at a different level of detail.

The serialization of ModuleDefinition has the following form:

The example below shows a simple ModuleDefinition containing two components, a FunctionalComponent for a DNA sequence encoding constitutive expression of GFP and another for the GFP protein expressed from this sequence, plus an interaction describing that relation.

The Module class represents the usage or occurrence of a ModuleDefinition within a larger design (that is, another ModuleDefinition).

The definition property
The definition property is a REQUIRED URI that refers to the ModuleDefinition for the Module. The definition property MUST NOT refer to the same ModuleDefinition as that which contains the Module.

ModuleDefinition objects X and Y. The reference chain "X contains A, A is defined by Y, Y contains B, and B is defined by X" is cyclical.
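A reference chain such as "X contains A, A is defined by Y, Y contains B, and B is defined by X" can be detected with a depth-first search over the "contains, then is defined by" relation. The following sketch is one possible implementation; the dictionary layout and the name find_definition_cycle are hypothetical.

```python
from typing import Dict, List, Optional


def find_definition_cycle(
    contains: Dict[str, List[str]],
    defined_by: Dict[str, str],
) -> Optional[List[str]]:
    """Return a cyclical chain of ModuleDefinition names, or None if acyclic.

    contains: ModuleDefinition name -> names of the Module objects it contains.
    defined_by: Module name -> name of the ModuleDefinition it refers to.
    """
    # Collapse each "contains" + "is defined by" pair into one directed edge.
    edges = {md: [defined_by[m] for m in mods] for md, mods in contains.items()}
    done = set()  # definitions already proven free of cycles

    def visit(node: str, path: List[str]) -> Optional[List[str]]:
        if node in path:
            return path[path.index(node):] + [node]
        if node in done:
            return None
        for nxt in edges.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        done.add(node)
        return None

    for start in edges:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


cyclic = find_definition_cycle({"X": ["A"], "Y": ["B"]}, {"A": "Y", "B": "X"})
acyclic = find_definition_cycle({"X": ["A"], "Y": []}, {"A": "Y"})
```

The same search also catches the simpler case in which a Module refers to the very ModuleDefinition that contains it, since that is a cycle of length one.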

The mapsTos property
The mapsTos property is an OPTIONAL set of MapsTo objects that refer to and link ComponentInstance objects

The serialization of a Module has the following form.

Section 7.9 ModuleDefinition
The example below specifies a TetR inverter that is being used as a part of a genetic toggle switch:

The ModuleDefinition describes how the FunctionalComponent interacts with others and summarizes their aggregate function.

The FunctionalComponent class inherits from the ComponentInstance class and therefore has the definition, access, and mapsTos properties. In addition, it has a direction property that specifies whether it serves as an input, output, both, or neither with regard to the ModuleDefinition that contains it.

The direction property
Each FunctionalComponent MUST specify via the direction property whether it serves as an input, output, both, or neither for its parent ModuleDefinition object. The value for this property MUST be one of the URIs given in

Indicates that the FunctionalComponent is neither an input nor an output.

The direction property is a means to encode how a designer thinks about the "purpose" of a connection in a system. In SBOL, such a connection is represented with a FunctionalComponent, and a system is represented with a ModuleDefinition. For example, consider a system that has been designed to sense the concentration of the cell-to-cell signaling molecule 3OC6HSL and report it via the concentration of another gene product. In this system, the concentration of 3OC6HSL is being sensed by the system, so the FunctionalComponent for 3OC6HSL would have a direction of "input." In turn, the concentration of the reporter gene product is intended to be read/consumed by other biological systems, so the FunctionalComponent for this product would have a direction of "output." The CDS encoding the product, however, is not intended to directly transfer information into or out of the ModuleDefinition for the system, so its FunctionalComponent would have a direction of "neither."
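The 3OC6HSL sensor example can be written down as data to show how a tool might derive the interface of a ModuleDefinition from direction values. This is a minimal sketch: plain strings stand in for the direction URIs (which are not reproduced here), and the component names and the interface function are hypothetical.

```python
from typing import Dict, List

# Plain-string stand-ins for the four direction values.
VALID_DIRECTIONS = {"input", "output", "both", "neither"}


def interface(directions: Dict[str, str]) -> Dict[str, List[str]]:
    """Group FunctionalComponent names into the inputs and outputs of a system."""
    for name, direction in directions.items():
        if direction not in VALID_DIRECTIONS:
            raise ValueError(f"{name} has an unknown direction: {direction}")
    return {
        "inputs": sorted(n for n, d in directions.items() if d in ("input", "both")),
        "outputs": sorted(n for n, d in directions.items() if d in ("output", "both")),
    }


sensor = {
    "3OC6HSL": "input",            # sensed by the system
    "reporter_protein": "output",  # read by other systems
    "reporter_cds": "neither",     # internal to the design
}
sensor_interface = interface(sensor)
```

The CDS does not appear in either list, reflecting that it does not carry information into or out of the system.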

The serialization of a FunctionalComponent has the following form.

The Interaction class provides more detailed description of how the FunctionalComponent objects of a ModuleDefinition are intended to work together. For example, this class can be used to represent different forms of genetic regulation (e.g., transcriptional activation or repression), processes from the central dogma of biology (e.g., transcription and translation), and other basic molecular interactions (e.g., non-covalent binding or enzymatic phosphorylation). Each Interaction includes a types property that refers to descriptive ontology terms and a participations property that describes which FunctionalComponent objects participate in the Interaction.

The types property
The types property is a REQUIRED set of URIs that describes the behavior represented by an Interaction. The types property MUST contain one or more URIs that MUST identify terms from appropriate ontologies. It is RECOMMENDED that exactly one URI contained by the types property refer to a term from the occurring entity

If an Interaction is well described by one of the terms from Table 13, then its types property MUST contain the URI that identifies this term. Lastly, if the types property of an Interaction contains multiple URIs, then they MUST identify non-conflicting terms. For example, the SBO terms "stimulation" and "inhibition" would conflict.

The participations property
The participations property is OPTIONAL and MAY contain a set of Participation objects, each of which identifies the roles that its referenced FunctionalComponent plays in the Interaction.

Even though an Interaction generally contains at least one Participation, the case of zero Participation objects is allowed because it is plausible that a designer might want to specify that an Interaction will exist, even if its participants have not yet been determined.

The serialization of an Interaction has the following form.

The example below shows an Interaction representing an inhibition relationship (SBO:0000169) between a repressor (SBO:0000020, full Participation details shown) and a promoter:

its referenced FunctionalComponent) in the context of its parent Interaction. The roles property MUST contain one or more URIs that MUST identify terms from appropriate ontologies. It is RECOMMENDED that exactly one URI contained by the roles property refer to a term from the participant role branch of the SBO.

If a Participation is well described by one of the terms from Table 14, then its roles property MUST contain the URI that identifies this term. Also, if a Participation belongs to an Interaction that has a type listed in Table 13, then the Participation SHOULD have a role that is cross-listed with this type in Table 14. Lastly, if the roles property of a Participation contains multiple URIs, then they MUST identify non-conflicting terms. For example, the SBO terms "stimulator" and "inhibitor" would conflict.

The participant property
The participant property MUST specify precisely one FunctionalComponent object that plays the designated role in its parent Interaction object.

The serialization of Participation objects has the following form.

Section 7.10 Collection
In the example below, the role of the participating FunctionalComponent is defined to be inhibitor, using the SBO:0000020 term. This component is specified using the participant property of the Participation entity.

The Collection class is a class that groups together a set of TopLevel objects that have something in common.

Some examples of Collection objects:
■ Results of a query to find all ComponentDefinition objects in a repository that function as promoters.

The members property
The members property of a Collection is OPTIONAL and MAY contain a set of URI references to zero or more TopLevel objects.

The serialization of a Collection has the following form:

The example below shows the serialization of a Collection object grouping together a library of constitutive promoters.

The purpose of the CombinatorialDerivation class is to specify combinatorial genetic designs without having to

The template property
The template property is REQUIRED and MUST contain a URI that refers to a ComponentDefinition. This ComponentDefinition is expected to serve as a template for the derivation of new ComponentDefinition objects. Consequently, its components property SHOULD contain one or more Component objects that describe its substructure (referred to hereafter as template Component objects), and its sequenceConstraints property MAY also contain one or more SequenceConstraint objects that constrain this substructure.

When a ComponentDefinition is derived in accordance with a CombinatorialDerivation, the wasDerivedFroms property of the derived ComponentDefinition SHOULD refer to the CombinatorialDerivation. When multiple ComponentDefinition objects are derived in accordance with the same CombinatorialDerivation, they MAY be referred to by the members property of a Collection, in which case the wasDerivedFroms property of the Collection SHOULD also refer to this CombinatorialDerivation.

If the types property of the template ComponentDefinition contains one or more URIs, then the types property of the derived ComponentDefinition SHOULD also contain those URIs. The same holds true for the roles properties of these ComponentDefinition objects.

The variableComponents property
The variableComponents property is OPTIONAL and MAY contain a set of VariableComponent objects. These

Section 7.11 CombinatorialDerivation
MUST NOT contain two or more VariableComponent objects that refer to the same template Component via their variable properties.

If the variable property of one of these VariableComponent objects refers to a template Component, then the components property of the derived ComponentDefinition SHOULD contain as many Component objects derived from the template Component as specified by the operator property of the VariableComponent (see Table 16). In addition, the definition properties of these derived Component objects MUST refer to ComponentDefinition objects specified by the variants, variantCollections, or variantDerivations property of the VariableComponent.

If no variable property of one of these VariableComponent objects refers to a template Component, then the components property of the derived ComponentDefinition SHOULD contain exactly one Component with a wasDerivedFroms property that refers to the template Component. The definition property of this derived Component MUST refer to the ComponentDefinition referred to by the definition property of the template Component.

Finally, all of these derived Component objects MUST follow the restriction properties of any SequenceConstraint objects that refer to their corresponding template Component objects.
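The derivation rules above amount to taking a Cartesian product over the options available for each template Component. The sketch below enumerates derived designs under two simplifying assumptions that are not stated in the text: every VariableComponent derives exactly one Component, and the template Component objects are given in a fixed order. The function name and data layout are hypothetical, and only the variants property is modeled.

```python
from itertools import product
from typing import Dict, List


def enumerate_derivations(
    template: List[str],
    definitions: Dict[str, str],
    variants: Dict[str, List[str]],
) -> List[List[str]]:
    """List the definition choices for every derived design.

    template: names of the template Component objects, in a fixed order.
    definitions: template Component name -> its own ComponentDefinition.
    variants: variable template Component name -> candidate ComponentDefinitions.
    """
    # A template Component with no VariableComponent keeps its own definition.
    options = [variants.get(name, [definitions[name]]) for name in template]
    return [list(choice) for choice in product(*options)]


designs = enumerate_derivations(
    template=["promoter", "rbs", "cds"],
    definitions={"promoter": "generic_promoter", "rbs": "rbs_1", "cds": "generic_cds"},
    variants={"promoter": ["pA", "pB"], "cds": ["gfp", "rfp"]},
)
```

Two promoter variants and two CDS variants yield four derived designs, each keeping the fixed RBS. A fuller implementation would also filter out designs that break any SequenceConstraint on the template.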
The strategy property
The strategy property is OPTIONAL and has a data type of URI.

The variable property is REQUIRED and MUST contain a URI that refers to a template Component in the template

CombinatorialDerivation referred to by the variantDerivations property of the VariableComponent.

If the roles property of the template Component contains one or more URIs, then the roles property of the derived Component SHOULD also contain those URIs.

The variants property
The variants property is OPTIONAL and MAY contain zero or more URIs that each refer to a ComponentDefinition. This property specifies individual ComponentDefinition objects to serve as options when deriving a new Component from the template Component.

The variantCollections property
The variantCollections property is OPTIONAL and MAY contain zero or more URIs that each refer to a Collection. The members property of each Collection referred to in this way MUST NOT be empty. This property enables the convenient specification of existing groups of ComponentDefinition objects to serve as options when deriving a new Component from the template Component.

The operator property
The operator property is REQUIRED and has a data type of URI. This property specifies how many Component objects SHOULD be derived from the template Component during the derivation of a new ComponentDefinition.

The URI value of this property MUST come from the URIs provided in Table 16.

The serialization of an Implementation has the following form:

In the example below, an Implementation links back to its target design via the wasDerivedFroms property. Since this particular sample did not match its target structure when synthesized in the lab, a ComponentDefinition representing its mutated structure is linked by the built field.

The purpose of the Attachment class is to serve as a general container for data files, especially experimental data files. It provides a means for linking files and metadata to SBOL designs.

The meta-data provided by the Attachment class include the following properties: the source or location of the actual file of the attachment, the format of the file, the size of the file, and the hash for the file.
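The four pieces of Attachment meta-data can be gathered from a file's contents in a few lines. The sketch below is illustrative only: the function name and dictionary keys are hypothetical, the example URL is a placeholder, and the use of SHA-1 for the hash is an assumption, since the hash algorithm is not stated in the text above.

```python
import hashlib
from typing import Optional


def attachment_metadata(source: str, content: bytes, file_format: Optional[str] = None) -> dict:
    """Collect source, format, size, and hash for a file given as bytes."""
    return {
        "source": source,                            # URI of the actual file
        "format": file_format,                       # optional format URI
        "size": len(content),                        # file size in bytes
        "hash": hashlib.sha1(content).hexdigest(),   # assumption: SHA-1 hex digest
    }


# Hypothetical measurement file held in memory.
data = b"time,fluorescence\n0,12.5\n1,13.1\n"
meta = attachment_metadata("http://example.org/data/plate1.csv", data)
```

Recomputing the size and hash after download gives a consumer a simple way to confirm that the file behind the source URI is the one the Attachment describes.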

The source property
The source property is REQUIRED and MUST contain a URI reference to the source file.

The format property
The format property is OPTIONAL and MAY contain a URI that specifies the format of the attached file. It is RECOMMENDED that this URI refer to a term from the EMBRACE Data and Methods (EDAM) ontology.

The size property
The size property is OPTIONAL and MAY contain a long indicating the file size in bytes.

The serialization of an Attachment MUST have the following form:

The purpose of the ExperimentalData class is to aggregate links to experimental data files. An ExperimentalData is typically associated with a single sample, lab instrument, or experimental condition and can be used to describe the output of the test phase of a design-build-test-learn workflow. For an example of the latter, see Figure 36.

As shown in Figure 24, the ExperimentalData class aggregates links to experimental data files using the OPTIONAL attachments property that it inherits from the TopLevel class.

The purpose of the Experiment class is to aggregate experimental data sets for subsequent analysis, usually in accordance with an experimental design.

As shown in Figure 24, the Experiment class aggregates ExperimentalData objects via the experimentalData property.

The experimentalData property
The experimentalData property is OPTIONAL and MAY contain a set of URI references to ExperimentalData objects. The same ExperimentalData MAY be referred to by more than one Experiment in this way.

context and design performance metrics) lack a clear consensus on their proper representation. In addition, some types of biological data are not directly relevant to design and are therefore outside of the scope of SBOL.

To enable representation of these data, SBOL allows developers to embed custom data within SBOL objects and documents, such that these data can be exchanged without being damaged or lost. This annotation and extension mechanism is designed to enable new types of data to be easily incorporated into the SBOL standard once there is community consensus on their proper representation.

28
Several methods are supported for connecting the SBOL data model with other types of application-specific data:

29
■ Custom data can be added to an SBOL object by annotating that object with non-conflicting properties. These 30 properties could contain literal data types such as Strings or URIs that require a resolution mechanism 31 to obtain external data. An example is annotating a ComponentDefinition with a property that contains a 32 String description and URI for the parts registry from which its source data was originally imported.

33
■ Custom data in the form of independent objects can be added to an SBOL document by creating 34 GenericTopLevel objects and annotating them as described above. An example is a GenericTopLevel object 35 that is annotated such that it represents a data sheet that describes the performance of a  Figure 26: Diagram of the Annotation class and its association with Identified and AnnotationValue objects, which is used for annotating SBOL entities with application specific data.

Serialization
The serialization of an Annotation has the following form:

The qName property specifies a namespace, prefix, and local part/name. The use of such qualified names is described in detail by the W3C (http://www.w3.org/TR/1999/REC-xml-names-19990114/#ns-using). Essentially, the "xmlns" property defines the prefix String to use as an alias for the namespace. The prefix can be any String. Its use is OPTIONAL, since it simply replaces the full namespace, thereby making the serialization easier for a human to read.

The first form of Annotation shown above is for an Annotation that contains a literal as its value. The second form is for an Annotation that contains a URI as its value. Finally, the third form is for an Annotation that contains a NestedAnnotations object as its value. In the last case, the nestedQName property specifies the nested namespace, nested prefix, and nested local part/name, while the nestedURI property specifies the URI for the NestedAnnotations object.

Section 7.16 Annotation and Extension of SBOL
The example below shows how the serialization for a promoter ComponentDefinition can be annotated with custom data. Annotations are added containing the relevant information from the iGEM Parts Registry. Each property serialization of an Annotation is qualified with the http://www.partsregistry.org/ namespace, which is prefixed using pr. The first Annotation is named pr:group. It specifies the iGEM group that has designed the promoter and has a String value. The second Annotation is named pr:experience. It contains a URI value that is serialized as an RDF resource and can be resolved to the information Web page on the Parts Registry for the promoter. Finally, the third Annotation is named pr:information. It contains a NestedAnnotations object that is serialized as shown and includes information about the regulatory details of the promoter using Annotations that correspond to Parts Registry categories.
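As a rough illustration of the three annotation value forms (literal, URI, nested), the following Python sketch builds an annotated ComponentDefinition with the standard library's xml.etree.ElementTree. It is not the specification's own listing. The pr prefix, its namespace, and the names pr:group, pr:experience, and pr:information come from the text; the identity URI, the group name, the resource URIs, and the nested local names (Information, regulation) are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SBOL = "http://sbols.org/v2#"
PR = "http://www.partsregistry.org/"

# Register prefixes so the output uses rdf:, sbol:, and pr: instead of ns0:, ns1:, ...
ET.register_namespace("rdf", RDF)
ET.register_namespace("sbol", SBOL)
ET.register_namespace("pr", PR)


def annotated_promoter(identity, group, experience_uri, nested_uri, regulation):
    """Build a ComponentDefinition element carrying three custom annotations."""
    cd = ET.Element(f"{{{SBOL}}}ComponentDefinition", {f"{{{RDF}}}about": identity})

    # Form 1: literal value (a plain String).
    grp = ET.SubElement(cd, f"{{{PR}}}group")
    grp.text = group

    # Form 2: URI value, serialized as an RDF resource.
    ET.SubElement(cd, f"{{{PR}}}experience", {f"{{{RDF}}}resource": experience_uri})

    # Form 3: nested annotations (hypothetical nested names, for illustration only).
    info = ET.SubElement(cd, f"{{{PR}}}information")
    nested = ET.SubElement(info, f"{{{PR}}}Information", {f"{{{RDF}}}about": nested_uri})
    reg = ET.SubElement(nested, f"{{{PR}}}regulation")
    reg.text = regulation
    return cd


# All argument values below are made up for the example.
promoter = annotated_promoter(
    "http://example.org/promoter1",
    "example_group",
    "http://example.org/experience/promoter1",
    "http://example.org/info/promoter1",
    "constitutive",
)
xml_text = ET.tostring(promoter, encoding="unicode")
```

Because the prefix is only an alias for the namespace, registering a different prefix for http://www.partsregistry.org/ would change how the output reads but not what it means.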

Custom data can also be embedded at the top level of an SBOL document. The GenericTopLevel class is used to represent top-level entities whose purpose is to contain a set of annotations that are independent of any other class of SBOL object. Entities that have independent existence and are not recognized by the SBOL standard are deserialized to GenericTopLevel objects. These GenericTopLevel objects can be safely used by tools to exchange non-SBOL data.

As with other TopLevel objects, GenericTopLevel objects MAY include the properties displayId, name, description, etc. The type of data annotating a GenericTopLevel object is indicated using the REQUIRED rdfType property, which MUST contain a QName. As before with the qName property, the rdfType property is used to set the namespace, prefix, and local part/name during serialization.

This section illustrates how to use the SBOL data model by specifying the design of a LacI/TetR toggle switch similar to those constructed in Gardner et al. (2000). This design is visualized conceptually in Figure 29 and in detail in Figure 30.

Conceptually, the toggle switch is constructed from two mutually repressing genes. With repressors LacI and TetR, this results in a bi-stable system that will tend to settle into a state where precisely one of the two repressors is strongly expressed, repressing the other. Each of these repressors can have its activity disrupted by a small molecule (IPTG for LacI, aTc for TetR), which enables the system to be "toggled" from one state to the other by dosing it with the appropriate small molecule.

The LacI/TetR toggle switch is modeled in SBOL as two parallel hierarchies of structure and function. The structural hierarchy of the toggle switch is represented using ComponentDefinitions:

■ The base elements of the hierarchy are DNA components, transcription factor proteins, and small molecules. As an example, Figure 31 is a UML diagram of the ComponentDefinition objects that represent these elements.

■ Base elements are composed to form more complex structures at the top of the hierarchy, including genes and non-covalent complexes between transcription factor proteins and small molecules. As an example,

TetR-dependent repression of LacI (the TetR inverter). As an example, Figure 33 is a UML diagram of the ModuleDefinition that represents the LacI inverter.

ComponentDefinition objects for the LacI inverter. These include ComponentDefinition objects based on DNA parts from the iGEM Parts Registry and ComponentDefinition objects that represent TetR mRNA, TetR, LacI, and IPTG. Each ComponentDefinition is associated with a Sequence that has an IUPAC DNA/RNA or IUPAC protein encoding, except the ComponentDefinition of IPTG, which is associated with a Sequence that has a SMILES encoding. In the case of the ComponentDefinition that represents the TetR gene, its sub-Component objects are located as Ranges along its Sequence using SequenceAnnotation objects. The ComponentDefinition that represents the IPTG-LacI complex, however, has no Sequence and its sub-Component objects are composed without any data about their relative positions.

In order for SBOL objects to be readily stored and exchanged, it is important that they are able to be serialized, i.e., converted to a sequence of bytes that can be stored in a file or exchanged over a network. The serialization format for SBOL is designed to meet several competing requirements. First, SBOL needs to support ad-hoc annotations and extensions. Second, SBOL needs to support processing by general database and semantic web software tools that have little or no knowledge of the SBOL data model. Finally, it ought to be relatively simple to write a new software implementation, so that SBOL can be readily used even in software environments where community-maintained implementations are not available.

To meet these goals, the canonical serialization of SBOL has been selected to be a strict dialect of RDF/XML Beckett

Sequence, however, is also TopLevel and is therefore not nested within and instead linked via a URI.

Each instance of a first-class SBOL data type MAY have annotations attached, as described in Section 7.16. These

being the value of that annotation. Annotation values are always nested within the RDF/XML serialization of the instance that they annotate. For example, a ModuleDefinition might add a DOI annotation that links to the scientific article that first described the system that it represents.

SBOL also supports top-level, user-defined data, again as described in Section 7.16. This is to allow non-standardized but necessary information to be carried around as part of a design. For example, a particular sub-community

can be human-examined by using XML formatting tools and syntax highlighters. It should be noted that XML was chosen for no particular reason other than the simple fact of its widespread use in the community and by existing libraries.

The proposers are aware that many of these benefits apply to several serializations of SBOL. The choice of XML is simply a codification of the existent standard and practice.

By adopting this paradigm of RDF/XML serialization, SBOL is able to adapt to future changes in the standard without requiring large-scale alterations to the RDF files. Since exactly the same scheme is used to serialize annotations as is used to serialize specification-defined properties and associations, it is possible to update the SBOL standard to

The relationship between the old and new objects (i.e., that the new object was derived from the old object), however, is not visible unless it is explicitly declared. This is RECOMMENDED to be done using the persistentIdentity and version properties. The preferred practice for declaring such a relationship is to use the same persistentIdentity for both objects, but give the newer object a later version. Then, when the new object is published, it can be clear to both humans and machines that this object is intended to update the previously published object. In this way, when a user wants the latest version of an object, they can obtain it by referencing the object via its persistentIdentity and rely on a tool to find the object with that persistentIdentity and the latest version.
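The lookup described above, in which a tool resolves a persistentIdentity to the object with the latest version, might be sketched in Python as follows. This is only an illustration under a simplifying assumption: version strings are treated as dot-separated non-negative integers, which is narrower than what the specification permits. The SbolObject record, the function names, and the example URIs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SbolObject:
    """Hypothetical minimal record holding only the identity-related properties."""
    identity: str
    persistent_identity: str
    version: str


def version_key(version):
    # Assumption: versions look like "1", "1.2", or "1.10.3".
    # Comparing integer tuples avoids the string-ordering trap where "10" < "2".
    return tuple(int(part) for part in version.split("."))


def latest_version(objects, persistent_identity):
    """Return the object with the given persistentIdentity and the highest version, or None."""
    candidates = [o for o in objects if o.persistent_identity == persistent_identity]
    if not candidates:
        return None
    return max(candidates, key=lambda o: version_key(o.version))


# Example data with made-up URIs.
store = [
    SbolObject("http://example.org/pLac/1", "http://example.org/pLac", "1"),
    SbolObject("http://example.org/pLac/2", "http://example.org/pLac", "2"),
    SbolObject("http://example.org/pLac/10", "http://example.org/pLac", "10"),
    SbolObject("http://example.org/pTet/1", "http://example.org/pTet", "1"),
]
newest = latest_version(store, "http://example.org/pLac")
```

A real tool would need a comparison that covers every version format allowed by the specification; the integer-tuple key here is only enough to show the idea.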

Maintaining unique identity URIs for all SBOL objects can be a very challenging implementation task. To reduce this burden, users of SBOL 2.x are encouraged to follow a few simple rules when constructing the identity properties and related properties for SBOL objects. When these rules are followed in constructing an SBOL object, we say that this object is compliant. These rules are as follows:

1. The identity of a compliant SBOL object MUST begin with a URI prefix that maps to a domain over which the user has control. Namely, the user can guarantee uniqueness of identities within this domain.

When annotating an SBOL document with additional information, there are two general methods that can be used:

■ Embed the information in the SBOL document, either as non-SBOL properties or GenericTopLevel objects.

■ Store the information separately and annotate the SBOL document with URIs that point to it.

In theory, either method can be used in any case. (Note that a third case not discussed here is to use SBOL to annotate external objects with links to SBOL documents, rather than annotate SBOL documents with links to external objects.)

In practice, embedding massive amounts of non-SBOL data into SBOL documents is likely to cause problems for people and software tools trying to manage and exchange such documents. Therefore, it is RECOMMENDED that small amounts of information (e.g., design notes or preferred graphical layout) be embedded in the SBOL model, while large amounts of information (e.g., the contents of the scientific publication from which a model was derived or flow cytometry data that characterizes performance) be linked with URIs pointing to external resources.

The boundary between "small" and "large" is left deliberately vague, recognizing that it will likely depend on the particulars of a given SBOL application.

Provenance is central to a range of quality control and attribution tasks within the Synthetic Biology design process.

Tracking attribution and derivation of one resource from another is paramount for managing intellectual property

applications, but this flexibility also presents an obstacle to standardized data exchange. Therefore a simple ontology (see Table 20 and Table 19) has been adopted to describe common provenance connections expected in synthetic biology workflows, based on the design-build-test-learn formalization of engineering.

The design-build-test-learn cycle is a common theme in synthetic biology and engineering literature.

This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License (http://creativecommons.org/licenses/by-nc-nd/3.0/).

Section 13.1 Adding Provenance with PROV-O
It is expected that users will develop their own ontology terms to specify how SBOL objects are used in a recipe, protocol, or computational analysis. However, these home-made ontologies will be very domain specific, and may not be intelligible to users working in another domain. For example, a modeler should not be expected to understand an ontology of Usage roles for DNA assembly. The terms "design", "build", "test", and "learn" provide a high level workflow abstraction that allows tool-builders to quickly search for and isolate provenance histories relevant to their

biology. An example of how these terms are used is provided in Figure 36.

...the completion of production of a new entity by an activity. This entity did not exist before generation and becomes available for usage after this generation.

These semantics are somewhat different from the versioning semantics defined in Section 7.4. The SBOL specification defines a new version of an object as an update of a previously published object (and therefore a previously existing object). Therefore, an SBOL object which is "generated" from another SHOULD be regarded as a new entity rather than a new version of an existing entity. However, this distinction is somewhat subjective (see Theseus's paradox). Therefore, we RECOMMEND as a best practice that objects linked by Activities not be successive versions of each other, though this is left to the discretion of users and library developers.

A generated Entity is linked through a wasGeneratedBy relationship to an Activity, which is used to describe how different Agents and other entities were used. An Activity is linked through associations to Associations, to describe the role of agents, and is linked through usages to Usages to describe the role of other entities used as part of the activity. Moreover, each Activity includes optional startedAtTime and endedAtTime properties. When using Activity to capture how an entity was derived, it is expected that any additional information needed will be attached as annotations. This may include software settings or textual notes. Activities can also be linked together using the wasInformedBys relationship to provide dependency without explicitly specifying start and end times.

The types property is an OPTIONAL set of URIs that explicitly specify the type of the provenance Activity in more detail. If specified, it is RECOMMENDED that at least one URI of the types property of an Activity refers to a URI from

The startedAtTime property
The startedAtTime property is OPTIONAL and contains a DateTime (see Section 12.7) value, indicating when the activity started. If this property is present, then the endedAtTime property is REQUIRED.

The endedAtTime property
The endedAtTime property is OPTIONAL and contains a DateTime (see Section 12.7) value, indicating when the activity ended.
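The dependency between the two time properties (endedAtTime becomes REQUIRED once startedAtTime is present) can be checked with a few lines of Python. This is a minimal sketch, not part of any SBOL library: the function name and message text are hypothetical, and the DateTime values are passed as opaque strings without being parsed.

```python
def activity_time_problems(started_at_time=None, ended_at_time=None):
    """Return a list of problems with an Activity's time properties.

    Both properties are individually OPTIONAL, so None means "absent".
    Values are treated as opaque strings; no DateTime parsing is attempted.
    """
    problems = []
    if started_at_time is not None and ended_at_time is None:
        problems.append("startedAtTime is present, so endedAtTime is REQUIRED")
    return problems


# Hypothetical example values.
both_absent = activity_time_problems()
only_end = activity_time_problems(ended_at_time="2019-03-21T12:00:00")
only_start = activity_time_problems(started_at_time="2019-03-21T10:00:00")
both_present = activity_time_problems("2019-03-21T10:00:00", "2019-03-21T12:00:00")
```

Only the start-without-end combination is reported, since the text places no requirement on an Activity that has an end time but no start time.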

The associations property
The associations property is OPTIONAL and MAY contain a set of URIs that refer to Association objects.

The usages property
The usages property is OPTIONAL and MAY contain a set of URIs that refer to Usage objects.

The wasInformedBys property
The wasInformedBys property is OPTIONAL and MAY contain a set of URIs that refer to other Activity objects.

How different entities are used in an Activity is specified with the Usage class, which is linked from an Activity through the Usage relationship. A Usage is then linked to an Entity through the Entity's URI and the role of this entity is qualified with the roles property. When the wasDerivedFroms property is used together with the full provenance described here, the entity pointed at by the wasDerivedFroms property MUST be included in a Usage.
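The requirement that every entity named by wasDerivedFroms also appear in a Usage reduces to a set difference. The Python sketch below assumes the relevant URIs have already been collected into plain lists; the function name and the example URIs are hypothetical.

```python
def derivations_missing_usage(was_derived_froms, usage_entities):
    """Return wasDerivedFroms URIs that no Usage refers to through its entity property."""
    return sorted(set(was_derived_froms) - set(usage_entities))


# Hypothetical URIs: the generated object was derived from two sources,
# but the generating Activity only declares a Usage for one of them.
derived_from = ["http://example.org/design/1", "http://example.org/template/1"]
used_entities = ["http://example.org/design/1", "http://example.org/strain/1"]
missing = derivations_missing_usage(derived_from, used_entities)
```

An empty result means the derivation sources are all covered by Usages.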
The entity property
The entity property is REQUIRED and MUST contain a URI which MAY refer to an SBOL Identified object.

The roles property
The roles property is an OPTIONAL set of URIs that refer to particular term(s) describing the usage of an entity referenced by the entity property. Recommended terms that are defined in Table 19 can be used to indicate how the referenced entity is being used in this Activity.

http://sbols.org/v2#design
Design describes the process by which a conceptual representation of an engineer's imagined and intended design for a biological system is derived, possibly from a predictive model or by modifying a pre-existing design. In the context of a Usage, the term indicates that the referenced entity was generated by some previous design Activity and was used by the present Activity as a design for a new object.

http://sbols.org/v2#build
Build describes the process by which a biological construct, sample, or clone is implemented in the laboratory. In the context of a Usage, the term indicates that the referenced entity was generated by some previous build Activity and was used by the present Activity as a built object.

http://sbols.org/v2#test
Test describes the process of performing experimental measurements to characterize a synthetic biological construct. In the context of a Usage, the term indicates that the referenced entity was generated by some previous test Activity and is used as test data in the present Activity.

Learn describes the process of analyzing experimental measurements to produce a new entity that represents biological knowledge. In the context of a Usage, the term indicates that the referenced entity was generated by some previous learn Activity and is used in the present Activity as a source of scientifically verified knowledge.

The agent property is REQUIRED and MUST contain a URI that refers to an Agent object.

The roles property
The roles property is an OPTIONAL set of URIs that refer to particular term(s) that describe the role of the agent in the parent Activity. The recommended terms that are defined in Table 20 can be used to specify the kind of Activity performed by the Agent.

http://sbols.org/v2#learn
Learn describes the process of analyzing the experimental measurements in order to produce a new entity that represents biological knowledge. In the context of an Association, the Agent processed the raw experimental data to produce an analysis. This process generates a new entity that represents biological knowledge, including tables or graphs referenced by the Attachments of an ExperimentalData, a Model produced by a fitting process, a consensus Sequence derived from sequencing results, etc.

The plan property is OPTIONAL and contains a URI that refers to a Plan.

The serialization of an Association MUST have the following form:

The serialization of a Usage MUST have the following form:

information, such as software version, needed to be able to run the same software again.

The serialization of an Agent MUST have the following form:

This example illustrates how the Prov ontology should be used to link a generated design to the combinatorial derivation that it was generated from. In this example, there is a top-level derivation (Promoter_Derivation) which specifies two possible promoters for this design, as well as an additional derivation (Terminator_Derivation) to be used for the Gen_Component. The second derivation (Terminator_Derivation) specifies two possible terminators to be used within the Gen_Component.

Unit, which may or may not have a Prefix (e.g. centi, milli, micro, etc.). As these classes are adopted by SBOL, Measure is treated as a subclass of Identified, while Unit and Prefix are treated as subclasses of TopLevel. In addition, SBOL adopts the following OM Unit subclasses: SingularUnit, CompoundUnit, UnitMultiplication,

SBOL-compliant tools are allowed to read, write, and modify data belonging to OM classes other than those described here, but this specification does not provide any guidance for the interpretation or use of these data in the context of SBOL.

The purpose of the Measure class is to link a numerical value to a Unit.

The hasNumericalValue property
The hasNumericalValue property is REQUIRED and MUST contain a single xsd:float.

The hasUnit property
The hasUnit property is REQUIRED and MUST contain a URI that refers to a Unit. The OM provides URIs for many existing instances of the Unit class for reference (for example, http://www.ontology-of-units-of-measure.

The types property
The types property is OPTIONAL and MAY contain a set of URIs. It is RECOMMENDED that one of these URIs iden-

Unit
As adopted by SBOL, Unit is an abstract class that is extended by other classes to describe units of measure using a shared set of properties.

The symbol property
The symbol property is REQUIRED and MUST contain a String. This String is commonly used to abbreviate the unit of measure's name. For example, the unit of measure named "gram per liter" is commonly abbreviated using the String "g/l".

The alternativeSymbols property
The alternativeSymbols property is OPTIONAL and MAY contain a set of Strings. This property can be used to specify alternative abbreviations other than that specified using the symbol property.

The label property
The label property is REQUIRED and MUST contain a String. This String is a common name for the unit of measure and SHOULD be identical to any String contained by the name property inherited from Identified.

The alternativeLabels property
The alternativeLabels property is OPTIONAL and MAY contain a set of Strings. This property can be used to specify alternative common names other than that specified using the label property.

The comment property
The comment property is OPTIONAL and MAY contain a String. This String is a description of the unit of measure and SHOULD be identical to any String contained by the description property inherited from Identified.

The longcomment property
The longcomment property is OPTIONAL and MAY contain a String. This String is a long description of the unit of measure and SHOULD be longer than any String contained by the comment property.

The purpose of the SingularUnit class is to describe a unit of measure that is not explicitly represented as a combination of multiple units, but could be equivalent to such a representation. For example, a joule is considered to be a SingularUnit, but it is equivalent to the multiplication of a newton and a meter.

The hasUnit property
The hasUnit property is OPTIONAL and MAY contain a URI. This URI MUST refer to another Unit. The hasUnit property can be used in conjunction with the hasFactor property to specify whether a SingularUnit is equivalent to another Unit multiplied by a factor. For example, an angstrom is equivalent to 10^-10 meters.

The hasFactor property
The hasFactor property is OPTIONAL and MAY contain a xsd:float. If the hasFactor property of a SingularUnit is non-empty, then its hasUnit property SHOULD also be non-empty.
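To make the hasUnit/hasFactor relationship concrete, the sketch below models a SingularUnit as a small Python dataclass and converts a value in angstroms to meters by following hasUnit. The class layout, field names, and to_referenced_unit helper are hypothetical and are not an SBOL library API. One assumption goes beyond the text: a SingularUnit with a hasUnit but no hasFactor is treated as having a factor of 1.0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SingularUnit:
    """Hypothetical in-memory stand-in for an om:SingularUnit."""
    symbol: str
    label: str
    has_unit: Optional["SingularUnit"] = None  # the Unit this one is defined in terms of
    has_factor: Optional[float] = None         # multiplier applied to has_unit


def to_referenced_unit(value, unit):
    """Follow hasUnit links, multiplying by each hasFactor along the way.

    Returns the converted value and the unit at the end of the chain.
    Assumption: a missing hasFactor counts as 1.0.
    """
    while unit.has_unit is not None:
        factor = unit.has_factor if unit.has_factor is not None else 1.0
        value = value * factor
        unit = unit.has_unit
    return value, unit


meter = SingularUnit(symbol="m", label="meter")
angstrom = SingularUnit(symbol="Å", label="angstrom", has_unit=meter, has_factor=1e-10)

length_in_meters, base_unit = to_referenced_unit(5.0, angstrom)
```

A unit with neither property set, such as meter here, is returned unchanged.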

The serialization of a SingularUnit MUST have the following form:

The purpose of the UnitMultiplication class is to describe a unit of measure that is the multiplication of two other units of measure.

The hasTerm1 property
The hasTerm1 property is REQUIRED and MUST contain a URI that refers to another Unit. This Unit is the first multiplication term.

The hasTerm2 property
The hasTerm2 property is REQUIRED and MUST contain a URI that refers to another Unit. This Unit is the second multiplication term. It is okay if the Unit referred to by hasTerm1 is the same as that referred to by hasTerm2.

The purpose of the UnitDivision class is to describe a unit of measure that is the division of one unit of measure by another.

The hasNumerator property
The hasNumerator property is REQUIRED and MUST contain a URI that refers to another Unit.

The hasDenominator property
The hasDenominator property is REQUIRED and MUST contain a URI that refers to another Unit.

The purpose of the UnitExponentiation class is to describe a unit of measure that is raised to an integer power.

The hasBase property
The hasBase property is REQUIRED and MUST contain a URI that refers to another Unit.

The hasExponent property
The hasExponent property is REQUIRED and MUST contain an xsd:integer.

The purpose of the PrefixedUnit class is to describe a unit of measure that is the multiplication of another unit of measure and a factor represented by a standard prefix such as "milli," "centi," "kilo," etc.

The hasUnit property
The hasUnit property is REQUIRED and MUST contain a URI that refers to another Unit.

The hasPrefix property
The hasPrefix property is REQUIRED and MUST contain a URI that refers to a Prefix.
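A PrefixedUnit therefore carries two references, one to a Unit and one to a Prefix whose hasFactor supplies the multiplier. The Python sketch below shows how a tool might use them to express a prefixed measurement in the unprefixed unit. All class and function names are hypothetical stand-ins rather than an SBOL library API, and object references replace the URIs that the properties actually contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    symbol: str
    label: str


@dataclass(frozen=True)
class Prefix:
    symbol: str
    label: str
    has_factor: float  # e.g. 1e-3 for "milli"


@dataclass(frozen=True)
class PrefixedUnit:
    symbol: str
    label: str
    has_unit: Unit      # the unit being prefixed
    has_prefix: Prefix  # the prefix supplying the factor


def in_unprefixed_unit(value, prefixed):
    """Convert a value in a PrefixedUnit to its underlying Unit."""
    return value * prefixed.has_prefix.has_factor, prefixed.has_unit


meter = Unit(symbol="m", label="meter")
milli = Prefix(symbol="m", label="milli", has_factor=1e-3)
millimeter = PrefixedUnit(symbol="mm", label="millimeter", has_unit=meter, has_prefix=milli)

meters, unit = in_unprefixed_unit(250.0, millimeter)
```

Keeping the factor on the Prefix rather than on each PrefixedUnit means that "milli" is defined once and reused for millimeter, milliliter, and so on.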

Prefix
As adopted by SBOL, Prefix is an abstract class that is extended by other classes to describe factors that are commonly represented by standard unit prefixes. For example, the factor 10^-3 is represented by the standard unit

The symbol property
The symbol property is REQUIRED and MUST contain a String. This String is commonly used to abbreviate the name of the unit prefix. For example, the String "m" is commonly used to abbreviate the name "milli."

The alternativeSymbols property
The alternativeSymbols property is OPTIONAL and MAY contain a set of Strings. This property can be used to specify alternative abbreviations other than that specified using the symbol property.

The label property
The label property is REQUIRED and MUST contain a String. This String is a common name for the unit prefix and SHOULD be identical to any String contained by the name property inherited from Identified.

The alternativeLabels property
The alternativeLabels property is OPTIONAL and MAY contain a set of Strings. This property can be used to specify alternative common names other than that specified using the label property.

The comment property
The comment property is OPTIONAL and MAY contain a String. This String is a description of the unit prefix and SHOULD be identical to any String contained by the description property inherited from Identified.

The longcomment property
The longcomment property is OPTIONAL and MAY contain a String. This String is a long description of the unit of measure and SHOULD be longer than any String contained by the comment property.

The hasFactor property
The hasFactor property is REQUIRED and MUST contain an xsd:float.

The purpose of the SIPrefix class is to describe standard SI prefixes such as "milli," "centi," "kilo," etc.

The serialization of a SIPrefix MUST have the following form:

The purpose of the BinaryPrefix class is to describe standard binary prefixes such as "kibi," "mebi," "gibi," etc. These prefixes commonly precede units of information such as "bit" and "byte."

The serialization of a BinaryPrefix MUST have the following form:

there are conditions under which it cannot be checked by a machine.

A star indicates a RECOMMENDED condition for following best practices. This rule is not strictly a matter of SBOL conformance, but its recommendation comes from logical reasoning. If an SBOL document does not follow this rule, it is still valid SBOL, but it might have degraded functionality in some tools.

We also include a fourth type of rule that represents a required condition for SBOL-compliance that cannot be checked by a machine. Therefore, violations of these rules are not expected to be reported as errors by any of the software libraries implementing SBOL 2.1 or above. It is the user's responsibility to make sure that these validation rules are followed.

A triangle indicates a weak REQUIRED condition for SBOL conformance. While this rule MUST be followed, it is not possible in practice for a machine to automatically check whether the rule has been followed.

The validation rules listed in the following subsections are all believed to be stated or implied in the rest of this specification document. They are enumerated here for convenience and to provide a "master checklist" for SBOL

For convenience and brevity, we use the shorthand "sbol:x" to stand for an attribute or element name x in the namespace for the SBOL specification, using the namespace prefix sbol. In reality, the prefix string can be different from the literal "sbol" used here (and indeed, it can be any valid XML namespace prefix that the software chooses). We use "sbol:x" because it is shorter than writing a full explanation everywhere we refer to an attribute or element in the SBOL specification namespace.
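The point that the prefix string is arbitrary can be demonstrated with a namespace-aware XML parser: two documents that bind different prefixes to the same namespace URI yield the same qualified element names. The Python sketch below uses the standard library's xml.etree.ElementTree. The two tiny documents and the helper function are made up for the illustration, and http://sbols.org/v2# is assumed to be the SBOL namespace, based on the term URIs used elsewhere in this document.

```python
import xml.etree.ElementTree as ET

# Assumed SBOL namespace URI (matches the http://sbols.org/v2#... term URIs in the text).
SBOL_NS = "http://sbols.org/v2#"

# Same content, different prefixes: "sbol" in one document, "s" in the other.
doc_with_sbol_prefix = (
    '<sbol:Sequence xmlns:sbol="http://sbols.org/v2#">'
    "<sbol:elements>atgc</sbol:elements>"
    "</sbol:Sequence>"
)
doc_with_short_prefix = (
    '<s:Sequence xmlns:s="http://sbols.org/v2#">'
    "<s:elements>atgc</s:elements>"
    "</s:Sequence>"
)


def sbol_child_text(xml_text, local_name):
    """Look up a child element by namespace URI and local name, ignoring the prefix."""
    root = ET.fromstring(xml_text)
    child = root.find(f"{{{SBOL_NS}}}{local_name}")
    return None if child is None else child.text


first = sbol_child_text(doc_with_sbol_prefix, "elements")
second = sbol_child_text(doc_with_short_prefix, "elements")
```

Both lookups succeed because the parser resolves each prefix to its namespace URI before matching, which is why software reading SBOL should match on the URI rather than on the literal string "sbol".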

General rules for an SBOL document

sbol-10104
If an SBOL document includes any wasDerivedFrom properties, then it MUST declare the use of the following XML namespace: http://www.w3.org/ns/prov#.

The identity property of an Identified object MUST be globally unique.

sbol-10212
The name property of an Identified object is OPTIONAL and MAY contain a String.

sbol-10213
The description property of an Identified object is OPTIONAL and MAY contain a String.

The displayId property of a compliant Identified object is REQUIRED.

Objects with the same persistentIdentity MUST be instances of the same class.

sbol-10225
An Identified with a wasGeneratedBys property that includes a reference to an Activity with a child Association that has a roles property that contains the URI http://sbols.org/v2#build SHOULD be an Implementation.

An Identified with a wasGeneratedBys property that includes a reference to an Activity with a child Association that has a roles property that contains the URI http://sbols.org/v2#test SHOULD be an ExperimentalData.

sbol-10227
An Identified with a wasGeneratedBys property that includes a reference to an Activity with a child Association that has a roles property that contains the URI http://sbols.org/v2#learn SHOULD NOT be an Implementation.

The attachments property of a TopLevel is OPTIONAL and MAY contain a set of URIs.

sbol-10402
The elements property of a Sequence is REQUIRED and MUST contain a String.
Reference: Section 7.6 on page 21

sbol-10403
The encoding property of a Sequence is REQUIRED and MUST contain a URI.

The elements property of a Sequence MUST be consistent with its encoding property.

sbol-10503
The types property of a ComponentDefinition MUST NOT contain more than one URI from

sbol-10507
The roles property of a ComponentDefinition is OPTIONAL and MAY contain a set of URIs.

The roles property of a ComponentDefinition MUST contain a URI from Table 4 if it is well-described by this URI.

Reference: Section 7.7 on page 22

The roles property of a ComponentDefinition SHOULD NOT contain a URI that refers to a term from the sequence feature branch of the SO unless its types property contains the DNA or RNA type URI listed in Table 2.

If the sequences property of a ComponentDefinition refers to one or more Sequence objects, and one of the types of this ComponentDefinition comes from

chy that refer to Sequence objects with the same encoding, then the elements properties of these Sequence objects SHOULD be consistent with each other, such that well-defined mappings exist from the "lower level" elements to the "higher level" elements in accordance

The sequenceAnnotations property of a ComponentDefinition is OPTIONAL and MAY con-

The sequenceAnnotations property of a ComponentDefinition MUST NOT contain two or more SequenceAnnotation objects that refer to the same Component.

sbol-10602
The definition property of a ComponentInstance is REQUIRED and MUST contain a URI.

sbol-10607
The access property of a ComponentInstance is REQUIRED and MUST contain a URI from

sbol-10706
The roles property of a Component SHOULD NOT contain a URI that refers to a term from the sequence feature branch of the SO unless the types property of the ComponentDefinition referred to by its definition property contains the DNA or RNA type URI listed in Table 2.

If the types property of the ComponentDefinition referred to by its definition contains the DNA or RNA type URI, then its roles property SHOULD contain no more than one URI that refers to a term from the sequence feature branch of the SO.

The roleIntegration property of a Component, if provided, MUST contain a URI from Table 6.

The refinement property of a MapsTo is REQUIRED and MUST contain a URI from Table 7.

sbol-11002
The orientation property of a Location is OPTIONAL and MAY contain a URI from Table 8.

sbol-11102
The start property of a Range is REQUIRED and MUST contain an Integer greater than zero.

Reference: Section 7.7.5 on page 35

sbol-11103
The end property of a Range is REQUIRED and MUST contain an Integer greater than zero.

sbol-11104
The value of the end property of a Range MUST be greater than or equal to the value of its start property.
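The three Range rules above (sbol-11102, sbol-11103, sbol-11104) can be checked mechanically. The following is a minimal sketch, not part of the specification: the `Range` dataclass and the `validate_range` function are hypothetical names introduced for illustration, and the sketch assumes that a missing property is represented as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Range:
    """Hypothetical in-memory stand-in for an SBOL Range (illustration only)."""
    start: Optional[int]
    end: Optional[int]


def _is_positive_int(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_range(r: Range) -> List[str]:
    """Return the identifiers of the Range rules that r violates."""
    violated = []
    if not _is_positive_int(r.start):
        violated.append("sbol-11102")  # start is REQUIRED and greater than zero
    if not _is_positive_int(r.end):
        violated.append("sbol-11103")  # end is REQUIRED and greater than zero
    # Only compare the two values when both are present as integers.
    if isinstance(r.start, int) and isinstance(r.end, int) and r.end < r.start:
        violated.append("sbol-11104")  # end MUST be >= start
    return violated
```

A single-position Range such as `Range(3, 3)` passes all three checks, since sbol-11104 permits the end to equal the start.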

sbol-11507
The language property of a Model SHOULD contain a URI that refers to a term from the EDAM ontology.

Reference: Section 7.8 on page 38

sbol-11508
The framework property of a Model is REQUIRED and MUST contain a URI.

sbol-11602
The roles property is OPTIONAL and MAY contain a set of URIs.

The modules property is OPTIONAL and MAY contain a set of Module objects.

sbol-11605
The interactions property is OPTIONAL and MAY contain a set of Interaction objects.

sbol-11606
The functionalComponents property is OPTIONAL and MAY contain a set of FunctionalComponent objects.

sbol-11607
The models property is OPTIONAL and MAY contain a set of URIs.

sbol-12007
Exactly one role in the set of roles SHOULD be a URI from the participant role branch of the SBO (see Table 14).

sbol-13002
The operator property of a VariableComponent is REQUIRED and MUST contain a URI.

Reference: Section 7.11.1 on page 50

sbol-13003
The URI contained by the operator property of a VariableComponent MUST come from Table 16.

sbol-13004
The variable property of a VariableComponent is REQUIRED and MUST contain a URI.

sbol-13605
The comment property of a Unit is OPTIONAL and MAY contain a String.

sbol-13705
If the hasFactor property of a SingularUnit is non-empty, then its hasUnit property SHOULD also be non-empty.

sbol-13804
The URI contained by the hasTerm1 property of a UnitMultiplication MUST NOT be identical to the URI contained by its identity property.

sbol-13805
The URI contained by the hasTerm2 property of a UnitMultiplication MUST NOT be identical to the URI contained by its identity property.

sbol-14002
The hasNumerator property of a UnitDivision is REQUIRED and MUST contain a URI.

sbol-14003
The hasDenominator property of a UnitDivision is REQUIRED and MUST contain a URI.

sbol-14004
The URI contained by the hasNumerator property of a UnitDivision MUST NOT be identical to the URI contained by its identity property.

sbol-14005
The URI contained by the hasDenominator property of a UnitDivision MUST NOT be identical to the URI contained by its identity property.

sbol-14102
The hasBase property of a UnitExponentiation is REQUIRED and MUST contain a URI.

sbol-14104
The URI contained by the hasBase property of a UnitExponentiation MUST NOT be identical to the URI contained by its identity property.

sbol-14105
The URI contained by the hasBase property of a UnitExponentiation MUST refer to a Unit.
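The self-reference rules for compound units listed above (sbol-13804, sbol-13805, sbol-14004, sbol-14005, sbol-14104) share one shape: a property of the unit MUST NOT point back at the unit's own identity. A minimal sketch of such a check follows. It assumes, purely for illustration, that each object is available as a plain dictionary mapping property names to URI strings; the function name `find_self_references` and the `example.org` URIs are hypothetical and do not come from the specification or from any SBOL library.

```python
from typing import Dict, List

# (class name, property name) -> rule identifier, copied from the rules above.
SELF_REFERENCE_RULES = {
    ("UnitMultiplication", "hasTerm1"): "sbol-13804",
    ("UnitMultiplication", "hasTerm2"): "sbol-13805",
    ("UnitDivision", "hasNumerator"): "sbol-14004",
    ("UnitDivision", "hasDenominator"): "sbol-14005",
    ("UnitExponentiation", "hasBase"): "sbol-14104",
}


def find_self_references(kind: str, obj: Dict[str, str]) -> List[str]:
    """Return the rule identifiers violated by a unit that refers to itself.

    kind is the class name of the object; obj maps property names
    (including "identity") to URI strings.
    """
    identity = obj.get("identity")
    violated = []
    for (rule_kind, prop), rule_id in SELF_REFERENCE_RULES.items():
        # A rule applies only to its own class and only if the property is set.
        if rule_kind == kind and prop in obj and obj[prop] == identity:
            violated.append(rule_id)
    return violated


# Hypothetical example: a UnitDivision whose numerator is the unit itself.
bad_division = {
    "identity": "http://example.org/unit/loop",
    "hasNumerator": "http://example.org/unit/loop",
    "hasDenominator": "http://example.org/unit/second",
}
print(find_self_references("UnitDivision", bad_division))
```

Keeping the rules in a lookup table rather than in separate functions means a further rule of the same shape can be covered by adding one entry.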

sbol-14202
The hasUnit property of a PrefixedUnit is REQUIRED and MUST contain a URI.

Reference: Section 13.2.8 on page 98

sbol-14203
The hasPrefix property of a PrefixedUnit is REQUIRED and MUST contain a URI.

sbol-14204
The URI contained by the hasUnit property of a PrefixedUnit MUST NOT be identical to the URI contained by its identity property.

sbol-14301
The symbol property of a Prefix is REQUIRED and MUST contain a String.

Reference: Section 13.2.9 on page 99

sbol-14302
The alternativeSymbols property of a Prefix is OPTIONAL and MAY contain a set of Strings.

Reference: Section 13.2.9 on page 99

sbol-14303
The label property of a Prefix is REQUIRED and MUST contain a String.

sbol-14304
The alternativeLabels property of a Prefix is OPTIONAL and MAY contain a set of Strings.

sbol-14305
The comment property of a Prefix is OPTIONAL and MAY contain a String.

Reference: Section 13.2.9 on page 99

sbol-14306
The longcomment property of a Prefix is OPTIONAL and MAY contain a String.

Reference: Section 13.2.9 on page 99

sbol-14307
The hasFactor property of a Prefix is REQUIRED and MUST contain an xsd:float.

sbol-14309
If both of the description property and comment properties of a Prefix are non-empty, then they SHOULD contain identical Strings.

sbol-14310
If both of the comment property and longcomment properties of a Prefix are non-empty, then the String contained by the longcomment property SHOULD be longer than the String contained by the comment property.

This example shows the serialization of a SequenceConstraint between two Component objects in a composite promoter ComponentDefinition. In the example, the promoter ComponentDefinition has two sub-Components that instantiate the ComponentDefinition objects for a core promoter region and a binding site. The SequenceConstraint specifies that the core promoter region precedes the binding site.

This example shows the serialization of application-specific data from Annotation objects. A ComponentDefinition that represents a promoter is annotated with custom data on the promoter's sigma factor and how it is regulated.