Base implementation of data exchange at CONABIO
Objective
This document describes the base implementation used to produce a specification for publishing and exchanging CONABIO data.
The base implementation covers only a subset of CONABIO's data, chosen because we believe it can be implemented quickly.
The datasets considered are:
- Taxonomic catalogue,
- SNIB individuals database.
Functional description of the data
This section specifies the queries that the datasets considered for the base implementation must be able to answer.
Taxonomic catalogue
- What is the basic information of the taxon that an IdTaxon refers to?
- Which IdTaxon is assigned to a taxon's scientific name?
- What is the taxonomic rank of a taxon?
- What are the synonyms of a taxon?
- What are the homonyms of a taxon?
- What are the previous (historical) names of a taxon?
- When was a taxon created?
- When was a taxon last updated?
- What is the status of a taxon's taxonomic information?
- What are the common names of a taxon?
- Which authority named a taxon?
- On what date was a taxon's current name assigned?
- What are the higher taxonomic ranks of a taxon?
- What are the lower taxonomic ranks of a taxon?
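As an illustration of how a few of these queries could be answered, here is a minimal in-memory sketch. It assumes each taxon stores a link to its direct parent and, for synonyms, a link to the accepted taxon. The `Taxon` class, the function names, the IDs, and the sample records are all hypothetical and are not part of the CONABIO catalogue.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Taxon:
    taxon_id: str
    scientific_name: str
    taxon_rank: str
    parent_id: Optional[str] = None    # direct higher-rank taxon
    accepted_id: Optional[str] = None  # set only when this name is a synonym


# Hypothetical sample records, for illustration only.
CATALOGUE: Dict[str, Taxon] = {
    t.taxon_id: t
    for t in [
        Taxon("T1", "Felidae", "family"),
        Taxon("T2", "Panthera", "genus", parent_id="T1"),
        Taxon("T3", "Panthera onca", "species", parent_id="T2"),
        Taxon("T4", "Felis onca", "species", accepted_id="T3"),
    ]
}


def id_for_name(name: str) -> Optional[str]:
    """Return the IdTaxon assigned to a scientific name, or None if unknown."""
    return next(
        (t.taxon_id for t in CATALOGUE.values() if t.scientific_name == name),
        None,
    )


def synonyms(taxon_id: str) -> List[str]:
    """Names whose accepted taxon is the given one."""
    return sorted(
        t.scientific_name for t in CATALOGUE.values() if t.accepted_id == taxon_id
    )


def higher_ranks(taxon_id: str) -> List[str]:
    """Names of the ancestors of a taxon, nearest first."""
    names = []
    current = CATALOGUE[taxon_id].parent_id
    while current is not None:
        names.append(CATALOGUE[current].scientific_name)
        current = CATALOGUE[current].parent_id
    return names


def lower_ranks(taxon_id: str) -> List[str]:
    """Names of the direct children of a taxon."""
    return sorted(
        t.scientific_name for t in CATALOGUE.values() if t.parent_id == taxon_id
    )
```

The parent and accepted-taxon links play the role of `parentNameUsageID` and `acceptedNameUsageID` in the TaxonType annex. A real service would resolve them against the catalogue database rather than a dictionary.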
SNIB individuals database
- What is the information associated with an idejemplar?
- Which collection does a record belong to?
- How valid is the record (taxonomically, spatially, and so on)?
- Which records exist for a taxon?
- Which records are at a location?
- Which records are contained in a polygon?
- IDs of the taxa at a location
- IDs of the taxa in a polygon
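The polygon queries can be sketched with a plain ray-casting test. This is only an illustration on planar coordinates with hypothetical sample records. It says nothing about how the SNIB stores geometry, and a production service would more likely delegate this to a spatial database.

```python
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: Sequence[Point]) -> bool:
    """Ray casting: count the polygon edges crossed by a ray going east from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical records using the Darwin Core field names from the OccurrenceType annex.
RECORDS: List[Dict] = [
    {"occurrenceID": "O1", "taxonID": "T3", "decimalLongitude": -99.1, "decimalLatitude": 19.4},
    {"occurrenceID": "O2", "taxonID": "T3", "decimalLongitude": -99.5, "decimalLatitude": 19.8},
    {"occurrenceID": "O3", "taxonID": "T7", "decimalLongitude": -89.6, "decimalLatitude": 20.9},
]


def records_in_polygon(records: List[Dict], polygon: Sequence[Point]) -> List[Dict]:
    """Records whose coordinates fall inside the polygon."""
    return [
        r for r in records
        if point_in_polygon((r["decimalLongitude"], r["decimalLatitude"]), polygon)
    ]


def taxon_ids_in_polygon(records: List[Dict], polygon: Sequence[Point]) -> Set[str]:
    """Distinct taxon IDs of the records inside the polygon."""
    return {r["taxonID"] for r in records_in_polygon(records, polygon)}


SQUARE = [(-100.0, 19.0), (-99.0, 19.0), (-99.0, 20.0), (-100.0, 20.0)]
```

With the sample data, `records_in_polygon(RECORDS, SQUARE)` keeps records O1 and O2, and `taxon_ids_in_polygon(RECORDS, SQUARE)` reduces them to the single taxon they share.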
Datasets
- The description of a dataset given an ID
- What is the identifier of a dataset?
- Who is responsible for a dataset?
- Whether the dataset has spatial or temporal information
- What is the schema of a dataset?
Proposal
The goal is a way to connect services to the information generated at CONABIO. To that end we propose building a query service that exposes the information we want to publish. In this base implementation we assume that our datasets are related to one another and that each one is, in a sense, atomic information; that is, any projection of it lacks relevance.
There are several ways to build this query service, for example HATEOAS or GraphQL. Both provide specifications for exposing and linking information the way we intend to. This proposal uses GraphQL, so we define the schema of each of our data types for implementation in the GraphQL schema. This description of the data is available in the annexes Primitive data types and CONABIO data types.
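To make the GraphQL option more concrete, the sketch below holds a small schema fragment and a query as Python strings and builds the JSON body a client would typically POST. The field names are taken from the TaxonType annex. The `Query` root, the `taxon` field and its argument, and the function name are assumptions made for illustration. `taxonRank` is shown as `String` only because the annex does not list the enum values.

```python
import json

# Schema fragment in GraphQL SDL. The Query root is an assumption.
SCHEMA_SDL = """
type TaxonType {
  taxonID: ID
  scientificName: String
  taxonRank: String
  acceptedNameUsageID: TaxonType
  parentNameUsageID: TaxonType
}

type Query {
  taxon(taxonID: ID!): TaxonType
}
"""

# A query that follows the link from a taxon to its direct parent.
TAXON_QUERY = """
query TaxonWithParent($id: ID!) {
  taxon(taxonID: $id) {
    scientificName
    taxonRank
    parentNameUsageID {
      scientificName
    }
  }
}
"""


def build_request_body(taxon_id: str) -> str:
    """Serialize the query and its variables as the JSON body of an HTTP POST."""
    return json.dumps({"query": TAXON_QUERY, "variables": {"id": taxon_id}})
```

Because `parentNameUsageID` is typed as `TaxonType`, a client can walk up the classification in a single request, which is the kind of linking between data that the proposal is after.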
In addition to drafting the initial schema, we decided to resolve queries to the service through composite access, splitting the services according to our proposed information nodes. A brief discussion is in Resolving queries in GraphQL.
Annexes
Technical considerations
Resolving queries in GraphQL
We can identify three ways to resolve queries in GraphQL: through a REST API, by accessing the information directly in the database, and through composite access.
Resolving queries through a REST API is very useful when the services that serve the data already exist, although it means maintaining two applications: the REST API and the GraphQL layer. A major advantage of this approach is that the REST services can be microservices, which makes it possible to divide responsibility for them among different owners.
If instead the query interface is built directly on the databases that hold the information, a schema that groups all the characteristics of the different kinds of information to be published has to be defined and maintained. Application performance will of course be optimal, but separating responsibilities becomes difficult.
Composite access means joining small GraphQL developments together. Like REST access, it allows each small development to have its own owner. Unlike the REST implementation, query resolution is delegated to the small developments and does not have to be implemented in the main one. The only thing developed in the main application is the joining of the different schemas.
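The idea of composite access can be sketched without any GraphQL library. Each sub-service owns its root fields and resolvers, and the main application only merges them and delegates. The service names, fields, and data below are hypothetical. In practice each sub-service would be its own GraphQL endpoint rather than a Python function.

```python
from typing import Any, Callable, Dict

Resolver = Callable[..., Any]


def taxon_service() -> Dict[str, Resolver]:
    """Root fields owned by a (hypothetical) taxonomy sub-service."""
    taxa = {"T3": {"taxonID": "T3", "scientificName": "Panthera onca"}}
    return {"taxon": lambda taxonID: taxa.get(taxonID)}


def occurrence_service() -> Dict[str, Resolver]:
    """Root fields owned by a (hypothetical) occurrence sub-service."""
    records = [{"occurrenceID": "O1", "taxonID": "T3"}]
    return {
        "occurrences": lambda taxonID: [r for r in records if r["taxonID"] == taxonID]
    }


def join_schemas(*services: Dict[str, Resolver]) -> Dict[str, Resolver]:
    """The only logic in the main application: merge root fields, rejecting clashes."""
    merged: Dict[str, Resolver] = {}
    for service in services:
        for field, resolver in service.items():
            if field in merged:
                raise ValueError(f"root field {field!r} is defined by two services")
            merged[field] = resolver
    return merged


GATEWAY = join_schemas(taxon_service(), occurrence_service())


def resolve(field: str, **arguments: Any) -> Any:
    """Delegate a root field to whichever sub-service owns it."""
    return GATEWAY[field](**arguments)
```

The main application never implements `taxon` or `occurrences` itself. Adding a new information node only requires passing one more service to `join_schemas`.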
Primitive data types
GeometryType
PointType
TemporalType
DCMI Period Encoding Scheme: specification of the limits of a time interval, and methods for encoding this in a text string.
Field | Description | URL | Type |
---|---|---|---|
name | A name for the time interval. | | String |
start | The instant corresponding to the commencement of the time interval. | | DateType |
end | The instant corresponding to the termination of the time interval. | | DateType |
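
Since the DCMI Period scheme is a text encoding, a short sketch may help show what a TemporalType value looks like as a string. It assumes the usual `component=value;` layout of the scheme and treats every value as an opaque string. The function names are hypothetical, and values that themselves contain `;` are not handled.

```python
from typing import Dict, Optional


def encode_period(
    start: Optional[str] = None,
    end: Optional[str] = None,
    name: Optional[str] = None,
) -> str:
    """Encode the components that are present as 'component=value;' pairs."""
    parts = []
    if name is not None:
        parts.append(f"name={name}")
    if start is not None:
        parts.append(f"start={start}")
    if end is not None:
        parts.append(f"end={end}")
    return "; ".join(parts) + ";" if parts else ""


def parse_period(text: str) -> Dict[str, str]:
    """Split an encoded period back into its components."""
    components: Dict[str, str] = {}
    for chunk in text.split(";"):
        if "=" in chunk:
            key, value = chunk.split("=", 1)
            components[key.strip()] = value.strip()
    return components
```

For example, `encode_period(start="1929", end="1939", name="The Great Depression")` produces `name=The Great Depression; start=1929; end=1939;`, and `parse_period` recovers the three components from that string.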
UserType
Field | Description | URL | Type |
---|---|---|---|
login | The user's login name. | | String |
name | The user's full name. | | String |
department | The name of the user's department. | | String |
| | Contact email address. | | String |
phone | Contact telephone number. | | String |
datasets | Datasets for which the user is responsible. | | [DatasetType] |
BoxEncodingType
DCMI Box Encoding Scheme: specification of the spatial limits of a place, and methods for encoding this in a text string.
Field | Description | URL | Type |
---|---|---|---|
name | A name for the place. | | String |
northlimit | The constant coordinate for the northernmost face or edge. | | Float |
eastlimit | The constant coordinate for the easternmost face or edge. | | Float |
southlimit | The constant coordinate for the southernmost face or edge. | | Float |
westlimit | The constant coordinate for the westernmost face or edge. | | Float |
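
A BoxEncodingType value can be sketched as a small data class that knows how to test a point and how to serialize itself in the `component=value` style of the DCMI Box scheme. The class and method names are hypothetical, the sample limits are rough round numbers chosen for illustration, and boxes that cross the antimeridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoxEncoding:
    name: str
    northlimit: float
    eastlimit: float
    southlimit: float
    westlimit: float

    def contains(self, longitude: float, latitude: float) -> bool:
        """True when the point lies within the four limits (edges included)."""
        return (
            self.southlimit <= latitude <= self.northlimit
            and self.westlimit <= longitude <= self.eastlimit
        )

    def encode(self) -> str:
        """Serialize the box as semicolon-separated 'component=value' pairs."""
        return (
            f"name={self.name}; northlimit={self.northlimit}; "
            f"eastlimit={self.eastlimit}; southlimit={self.southlimit}; "
            f"westlimit={self.westlimit}"
        )


# Rough, illustrative limits only; not an official extent.
SAMPLE_BOX = BoxEncoding("Sample region", 33.0, -86.0, 14.0, -119.0)
```

A dataset whose `spatial` field carries such a box can answer "does this dataset cover this location?" with a single call to `contains`.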
CONABIO data types
Taxonomic (TaxonType)
Field | Description | URL | Type |
---|---|---|---|
modified | The most recent date-time on which the resource was changed. | http://purl.org/dc/terms/modified | DateType |
taxonID | An identifier for the set of taxon information (data associated with the Taxon class). May be a global unique identifier or an identifier specific to the data set. | http://rs.tdwg.org/dwc/terms/taxonID | ID |
acceptedNameUsageID | An identifier for the name usage (documented meaning of the name according to a source) of the currently valid (zoological) or accepted (botanical) taxon. | http://rs.tdwg.org/dwc/terms/acceptedNameUsageID | TaxonType |
parentNameUsageID | An identifier for the name usage (documented meaning of the name according to a source) of the direct, most proximate higher-rank parent taxon (in a classification) of the most specific element of the scientificName. | http://rs.tdwg.org/dwc/terms/parentNameUsageID | TaxonType |
scientificName | The full scientific name, with authorship and date information if known. When forming part of an Identification, this should be the name in lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the IdentificationQualifier term. | http://rs.tdwg.org/dwc/terms/scientificName | String |
nameAccordingTo | The reference to the source in which the specific taxon concept circumscription is defined or implied - traditionally signified by the Latin "sensu" or "sec." (from secundum, meaning "according to"). For taxa that result from identifications, a reference to the keys, monographs, experts and other sources should be given. | http://rs.tdwg.org/dwc/terms/nameAccordingTo | String |
namePublishedIn | A reference for the publication in which the scientificName was originally established under the rules of the associated nomenclaturalCode. | http://rs.tdwg.org/dwc/terms/namePublishedIn | String |
namePublishedInYear | The four-digit year in which the scientificName was published. | http://rs.tdwg.org/dwc/terms/namePublishedInYear | Int |
taxonRank | The taxonomic rank of the most specific name in the scientificName. | http://rs.tdwg.org/dwc/terms/taxonRank | Enum |
scientificNameAuthorship | The authorship information for the scientificName formatted according to the conventions of the applicable nomenclaturalCode. | http://rs.tdwg.org/dwc/terms/scientificNameAuthorship | String |
taxonomicStatus | The status of the use of the scientificName as a label for a taxon. Requires taxonomic opinion to define the scope of a taxon. Rules of priority then are used to define the taxonomic status of the nomenclature contained in that scope, combined with the experts opinion. It must be linked to a specific taxonomic reference that defines the concept. | http://rs.tdwg.org/dwc/terms/taxonomicStatus | Enum |
Individuals (OccurrenceType)
Field | Description | URL | Type |
---|---|---|---|
modified | The most recent date-time on which the resource was changed. | http://purl.org/dc/terms/modified | DateType |
license | A legal document giving official permission to do something with the resource. | http://purl.org/dc/terms/license | String |
rightsHolder | A person or organization owning or managing rights over the resource. | http://purl.org/dc/terms/rightsHolder | String |
bibliographicCitation | A bibliographic reference for the resource as a statement indicating how this record should be cited (attributed) when used. | http://purl.org/dc/terms/bibliographicCitation | String |
references | A related resource that is referenced, cited, or otherwise pointed to by the described resource. | http://purl.org/dc/terms/references | String |
collectionID | An identifier for the collection or dataset from which the record was derived. | http://rs.tdwg.org/dwc/terms/collectionID | String |
institutionCode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. | http://rs.tdwg.org/dwc/terms/institutionCode | String |
collectionCode | The name, acronym, coden, or initialism identifying the collection or data set from which the record was derived. | http://rs.tdwg.org/dwc/terms/collectionCode | String |
basisOfRecord | The specific nature of the data record. | http://rs.tdwg.org/dwc/terms/basisOfRecord | Enum |
occurrenceID | An identifier for the Occurrence (as opposed to a particular digital record of the occurrence). In the absence of a persistent global unique identifier, construct one from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. | http://rs.tdwg.org/dwc/terms/occurrenceID | ID |
recordedBy | A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first. | http://rs.tdwg.org/dwc/terms/recordedBy | String |
sex | The sex of the biological individual(s) represented in the Occurrence. | http://rs.tdwg.org/dwc/terms/sex | String |
lifeStage | The age class or life stage of the biological individual(s) at the time the Occurrence was recorded. | http://rs.tdwg.org/dwc/terms/lifeStage | String |
occurrenceRemarks | Comments or notes about the Occurrence. | http://rs.tdwg.org/dwc/terms/occurrenceRemarks | String |
eventDate | The date-time or interval during which an Event occurred. For occurrences, this is the date-time when the event was recorded. Not suitable for a time in a geological context. | http://rs.tdwg.org/dwc/terms/eventDate | DateType |
year | The four-digit year in which the Event occurred, according to the Common Era Calendar. | http://rs.tdwg.org/dwc/terms/year | Int |
month | The ordinal month in which the Event occurred. | http://rs.tdwg.org/dwc/terms/month | Int |
day | The integer day of the month on which the Event occurred. | http://rs.tdwg.org/dwc/terms/day | Int |
habitat | A category or description of the habitat in which the Event occurred. | http://rs.tdwg.org/dwc/terms/habitat | String |
higherGeography | A list (concatenated and separated) of geographic names less specific than the information captured in the locality term. | http://rs.tdwg.org/dwc/terms/higherGeography | String |
country | The name of the country or major administrative unit in which the Location occurs. | http://rs.tdwg.org/dwc/terms/country | String |
countryCode | The standard code for the country in which the Location occurs. | http://rs.tdwg.org/dwc/terms/countryCode | String |
stateProvince | The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the Location occurs. | http://rs.tdwg.org/dwc/terms/stateProvince | String |
county | The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the Location occurs. | http://rs.tdwg.org/dwc/terms/county | String |
municipality | The full, unabbreviated name of the next smaller administrative region than county (city, municipality, etc.) in which the Location occurs. Do not use this term for a nearby named place that does not contain the actual location. | http://rs.tdwg.org/dwc/terms/municipality | String |
locality | The specific description of the place. Less specific geographic information can be provided in other geographic terms (higherGeography, continent, country, stateProvince, county, municipality, waterBody, island, islandGroup). This term may contain information modified from the original to correct perceived errors or standardize the description. | http://rs.tdwg.org/dwc/terms/locality | String |
verbatimLocality | The original textual description of the place. | http://rs.tdwg.org/dwc/terms/verbatimLocality | String |
minimumElevationInMeters | The lower limit of the range of elevation (altitude, usually above sea level), in meters. | http://rs.tdwg.org/dwc/terms/minimumElevationInMeters | Int |
maximumElevationInMeters | The upper limit of the range of elevation (altitude, usually above sea level), in meters. | http://rs.tdwg.org/dwc/terms/maximumElevationInMeters | Int |
minimumDepthInMeters | The lesser depth of a range of depth below the local surface, in meters. | http://rs.tdwg.org/dwc/terms/minimumDepthInMeters | Int |
maximumDepthInMeters | The greater depth of a range of depth below the local surface, in meters. | http://rs.tdwg.org/dwc/terms/maximumDepthInMeters | Int |
decimalLatitude | The geographic latitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. Legal values lie between -90 and 90, inclusive. | http://rs.tdwg.org/dwc/terms/decimalLatitude | Float |
decimalLongitude | The geographic longitude (in decimal degrees, using the spatial reference system given in geodeticDatum) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. Legal values lie between -180 and 180, inclusive. | http://rs.tdwg.org/dwc/terms/decimalLongitude | Float |
geodeticDatum | The ellipsoid, geodetic datum, or spatial reference system (SRS) upon which the geographic coordinates given in decimalLatitude and decimalLongitude are based. | http://rs.tdwg.org/dwc/terms/geodeticDatum | String |
coordinateUncertaintyInMeters | The horizontal distance (in meters) from the given decimalLatitude and decimalLongitude describing the smallest circle containing the whole of the Location. Leave the value empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term. | http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters | Int |
coordinatePrecision | A decimal representation of the precision of the coordinates given in the decimalLatitude and decimalLongitude. | http://rs.tdwg.org/dwc/terms/coordinatePrecision | Float |
identifiedBy | A list (concatenated and separated) of names of people, groups, or organizations who assigned the Taxon to the subject. | http://rs.tdwg.org/dwc/terms/identifiedBy | String |
dateIdentified | The date on which the subject was identified as representing the Taxon. | http://rs.tdwg.org/dwc/terms/dateIdentified | DateType |
taxonID | An identifier for the set of taxon information (data associated with the Taxon class). May be a global unique identifier or an identifier specific to the data set. | http://rs.tdwg.org/dwc/terms/taxonID | ID |
scientificNameID | An identifier for the nomenclatural (not taxonomic) details of a scientific name. | http://rs.tdwg.org/dwc/terms/scientificNameID | TaxonType |
scientificName | The full scientific name, with authorship and date information if known. When forming part of an Identification, this should be the name in lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in the IdentificationQualifier term. | http://rs.tdwg.org/dwc/terms/scientificName | String |
acceptedNameUsage | The full name, with authorship and date information if known, of the currently valid (zoological) or accepted (botanical) taxon. | http://rs.tdwg.org/dwc/terms/acceptedNameUsage | String |
kingdom | The full scientific name of the kingdom in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/kingdom | String |
phylum | The full scientific name of the phylum or division in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/phylum | String |
class | The full scientific name of the class in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/class | String |
order | The full scientific name of the order in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/order | String |
family | The full scientific name of the family in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/family | String |
genus | The full scientific name of the genus in which the taxon is classified. | http://rs.tdwg.org/dwc/terms/genus | String |
subgenus | The full scientific name of the subgenus in which the taxon is classified. Values should include the genus to avoid homonym confusion. | http://rs.tdwg.org/dwc/terms/subgenus | String |
specificEpithet | The name of the first or species epithet of the scientificName. | http://rs.tdwg.org/dwc/terms/specificEpithet | String |
infraspecificEpithet | The name of the lowest or terminal infraspecific epithet of the scientificName, excluding any rank designation. | http://rs.tdwg.org/dwc/terms/infraspecificEpithet | String |
taxonRank | The taxonomic rank of the most specific name in the scientificName. | http://rs.tdwg.org/dwc/terms/taxonRank | Enum |
taxonomicStatus | The status of the use of the scientificName as a label for a taxon. Requires taxonomic opinion to define the scope of a taxon. Rules of priority then are used to define the taxonomic status of the nomenclature contained in that scope, combined with the experts opinion. It must be linked to a specific taxonomic reference that defines the concept. | http://rs.tdwg.org/dwc/terms/taxonomicStatus | Enum |
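
Some of the descriptions above carry constraints that a service could check when reporting the spatial validity of a record. The sketch below covers only the latitude and longitude ranges and the rule that zero is not a valid `coordinateUncertaintyInMeters`. The function name is hypothetical, and rejecting negative uncertainty as well is an added assumption.

```python
from typing import Dict, List


def coordinate_problems(record: Dict) -> List[str]:
    """Return the coordinate constraints that a record violates (empty if none)."""
    problems: List[str] = []
    latitude = record.get("decimalLatitude")
    longitude = record.get("decimalLongitude")
    uncertainty = record.get("coordinateUncertaintyInMeters")

    if latitude is not None and not -90 <= latitude <= 90:
        problems.append("decimalLatitude outside [-90, 90]")
    if longitude is not None and not -180 <= longitude <= 180:
        problems.append("decimalLongitude outside [-180, 180]")
    # Zero is invalid per the field description; negative values are
    # rejected here as an extra assumption.
    if uncertainty is not None and uncertainty <= 0:
        problems.append("coordinateUncertaintyInMeters must be positive when present")
    return problems
```

Missing values are treated as acceptable, following the instruction to leave the uncertainty empty when it is unknown or not applicable.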
Dataset (DatasetType)
Field | Description | URL | Type |
---|---|---|---|
id | A property reserved for globally unique identifiers. | | ID |
title | A title or label for the resource. | | String |
description | A description of the resource, formatted as Markdown. | | String |
person_in_charge | | | UserType |
licenses | | | Enum |
path | The location of the resource data. | | URL |
temporal | Temporal characteristics of the resource data. | http://purl.org/dc/terms/temporal | TemporalType |
spatial | Spatial characteristics of the resource data. | http://purl.org/dc/terms/spatial | BoxEncodingType |
keywords | An array of string keywords to assist users searching for the package in catalogs. | | [String] |