
Onomastics. API

05.06.2013

This API provides access to statistical information about names and surnames of the population and names of children. Idescat disseminates this information in the Onomastics section.

The use of this service requires acceptance of the Terms of use of Idescat's APIs.

Summary
Base URI: http://api.idescat.cat/onomastica/v1/{subservice}/{operation}.{format}[?parameters]
HTTP method: GET
Response formats: xml, json, php, txt
Version: 1.00 (05.06.2013)
Shortcuts: Request, Response
Operations: dades, cerca, sug
Widgets that use this service: Names in Catalonia

1. Request

1.1. Basic characteristics

Every request to the Onomastics API must specify the service (onomastica), version, subservice, operation and format. Version, subservice (when there is one) and operation are specific to each service. Besides the general response formats of the Idescat APIs, this service supports the text format in the sug operation. For more information, see the Anatomy of requests section of the general documentation on Idescat APIs.

1.1.1. Identifier of service and version and admitted subservices

The identifier of this service is onomastica.

http://api.idescat.cat/onomastica/v1/{…}

It accepts three subservices:

  • noms: data of given names of the population
  • cognoms: data of surnames of the population
  • nadons: data of given names of the newborns

You must specify one subservice.

http://api.idescat.cat/onomastica/v1/nadons/{…}

1.1.2. Operations

This service supports three operations:

  • dades: Returns statistics for a given name or surname.
    http://api.idescat.cat/onomastica/v1/nadons/dades.{…}
  • cerca: Returns search results by name or surname fragment.
    http://api.idescat.cat/onomastica/v1/noms/cerca.{…}
  • sug: Returns the alphabetical list of names or surnames beginning with the specified characters.
    http://api.idescat.cat/onomastica/v1/cognoms/sug.{…}
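
The URI pattern described above can be assembled mechanically. The following Python sketch is only an illustration: the helper name build_url and its client-side checks are assumptions, not part of the Idescat API.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.idescat.cat/onomastica/v1"
SUBSERVICES = ("noms", "cognoms", "nadons")
OPERATIONS = ("dades", "cerca", "sug")
FORMATS = ("xml", "json", "php", "txt")


def build_url(subservice, operation, fmt, params=None):
    """Assemble {subservice}/{operation}.{format}[?parameters] (hypothetical helper)."""
    if subservice not in SUBSERVICES:
        raise ValueError(f"unknown subservice: {subservice!r}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    if fmt == "txt" and operation != "sug":
        # The text format is documented only for the sug operation.
        raise ValueError("txt is only available for the sug operation")
    url = f"{BASE_URI}/{subservice}/{operation}.{fmt}"
    if params:
        # Keep ':' and ',' readable, as in values such as geo=com:01 or sim=0,1.
        url += "?" + urlencode(params, safe=":,")
    return url


print(build_url("nadons", "dades", "xml", {"id": 40683, "lang": "en"}))
```

Parameters are passed as a dictionary because some parameter names of this service, such as class, are reserved words in Python.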

1.2. Specific parameters

This service supports the general parameters (language and encoding) of Idescat APIs.

The specific parameters let you choose the information returned by a given operation of the service. They can be specified as individual parameters or as a single p parameter (compact form).

1.2.1. Operation dades

The dades operation returns statistics for a given name or surname.

1.2.1.1. Filter id

This required parameter is used to specify the given name (in the case of the noms and nadons subservices) or the surname (in the case of the cognoms subservice) of the statistics that you want to retrieve.

Ex. 1: Statistical information from the latest available year on newborn girls named MARIA (40683)
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=40683&lang=en
Ex. 2: Statistical information from the latest available year on newborn boys named MARIA (39969)
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=39969&lang=en

The cerca operation can be used to obtain the identifier of a certain name or surname.

The id parameter also allows you to specify the name or surname literal (without accents) as recorded in the statistics of onomastics. However, it should be noted that when a literal is specified in the id parameter, it cannot be combined with any other parameter.

Ex. 3: Statistical information from the latest available year of newborns named MARIA
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=MARIA&lang=en

Because MARIA can be both a boy's and a girl's name, the response to the previous call contains separate statistics for each sex (40683, 39969).
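
The two ways of using id (numeric identifier or literal) can be sketched in Python as follows. The helper dades_url is hypothetical. It assumes that the general lang parameter may still accompany a literal, as in Ex. 3, and that only the specific filters are excluded.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.idescat.cat/onomastica/v1"


def dades_url(subservice, name_or_id, filters=None, lang="en", fmt="xml"):
    """Build a dades request for a numeric identifier or a literal (hypothetical helper)."""
    filters = dict(filters or {})
    is_literal = not str(name_or_id).isdigit()
    if is_literal and filters:
        # Documented restriction: a literal id cannot be combined with other parameters.
        raise ValueError("a literal id cannot be combined with filters")
    query = {"id": name_or_id, **filters, "lang": lang}
    return f"{BASE_URI}/{subservice}/dades.{fmt}?" + urlencode(query, safe=":,")


print(dades_url("nadons", 40683))    # by identifier
print(dades_url("nadons", "MARIA"))  # by literal, one row per sex in the response
```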

1.2.1.2. Filter geo

The geo filter restricts the results geographically. To select the place of residence, all subservices accept the following prefixes:

  • prov: province
  • at: areas of the Territorial Plan
  • com: county

Ex. 4: Statistical information of people with surname GARCIA (285413) in the Alt Camp population (COM:01) in the last available year
http://api.idescat.cat/onomastica/v1/cognoms/dades.xml?id=285413&geo=com:01&lang=en
Ex. 5: Statistical information of women named MARIA (40683) in the Comarques Gironines population (AT:02) in the last available year
http://api.idescat.cat/onomastica/v1/noms/dades.xml?id=40683&geo=at:02&lang=en

The class parameter can be used to find out the codes used in the different geographical areas.
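
A geo value is a prefix and a code joined by a colon. The sketch below shows one way to check such a value on the client before sending a request. The function names are hypothetical, and normalising the prefix to lower case is an assumption made to match the example URLs; the codes themselves are not validated here and would have to be obtained with the class parameter.

```python
GEO_PREFIXES = {
    "prov": "province",
    "at": "area of the Territorial Plan",
    "com": "county",
}


def parse_geo(value):
    """Split a geo value such as 'com:01' into (prefix, code)."""
    prefix, sep, code = value.partition(":")
    prefix = prefix.strip().lower()
    code = code.strip()
    if not sep or prefix not in GEO_PREFIXES or not code:
        raise ValueError(f"invalid geo value: {value!r}")
    return prefix, code


def format_geo(prefix, code):
    """Return a normalised geo value, for example format_geo('at', '02')."""
    return ":".join(parse_geo(f"{prefix}:{code}"))


print(parse_geo("COM:01"))
print(format_geo("at", "02"))
```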

1.2.1.3. Filter t

The t parameter can be used in all subservices, except cognoms, to filter the statistics by time. In the case of the nadons subservice, this parameter indicates the year of birth.

Ex. 6: Statistical information on girls born in 2009 and named MARIA (40683)
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=40683&t=2009&lang=en

The class parameter can be used to find out the available years in this subservice. If t is not specified or has an incorrect value, the nadons subservice returns statistics from the latest available year.

In the case of the noms and cognoms subservices, statistics from the latest Population Register of Catalonia are always returned. In the noms subservice, however, the t parameter can be used to filter the population by decade of birth.

Ex. 7: Statistical information on the population of women born between 1990 and 1999 named MARIA (40683)
http://api.idescat.cat/onomastica/v1/noms/dades.xml?id=40683&t=1990-1999&lang=en

In the subservices that accept the t filter, it can be combined with the geo filter.

Ex. 8: Statistical information on girls born in Baix Llobregat (com:11) in 2005 and named MARIA (40683)
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=40683&t=2005&geo=com:11&lang=en

The class parameter can be used to find out the valid values in the noms and cognoms subservices. If t is not specified or has an incorrect value, these two subservices return statistics of the entire population from the latest available year.
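
The difference between the two forms of t (a year for nadons, a decade of birth for noms) can be summarised in a short Python sketch. The helper t_value is hypothetical, and it assumes that decades always run from a year ending in 0 to a year ending in 9, as in the examples 1990-1999 and 1980-1989. The values actually available should be checked with class=t.

```python
def t_value(subservice, year):
    """Return the t value for a year of birth (hypothetical helper)."""
    if subservice == "nadons":
        # Newborns: t is the year of birth.
        return str(year)
    if subservice == "noms":
        # Population: t is the decade of birth (assumed to start in a year ending in 0).
        start = year - year % 10
        return f"{start}-{start + 9}"
    raise ValueError(f"the {subservice} subservice does not accept the t filter")


print(t_value("nadons", 2009))
print(t_value("noms", 1994))
```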

1.2.1.4. Filter nac

In the noms subservice, the nac parameter can be used to filter statistics by citizenship. This parameter cannot be combined with other filters.

Ex. 9: Statistical information on women named MARIA (40683) with Polish citizenship
http://api.idescat.cat/onomastica/v1/noms/dades.xml?id=40683&nac=pl&lang=en

The nac parameter uses the country codes of the ISO 3166-1 alpha-2 standard. You can use the class parameter to retrieve the available countries. If nac is not specified or has an incorrect value, the noms subservice returns statistics of the entire population from the latest available year.
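
The restrictions on nac in the dades operation can be checked on the client before sending a request. This is a minimal sketch with a hypothetical helper. It only verifies that the code has the two-letter shape of ISO 3166-1 alpha-2, not that the country is actually available, and the lower-case normalisation is an assumption based on the example (nac=pl).

```python
import re


def dades_nac_query(name_id, country_code, geo=None, t=None):
    """Build the query of a noms/dades request filtered by citizenship."""
    if geo is not None or t is not None:
        # Documented restriction: nac cannot be combined with other filters.
        raise ValueError("nac cannot be combined with the geo or t filters")
    code = country_code.strip().lower()
    if not re.fullmatch(r"[a-z]{2}", code):
        raise ValueError(f"not a two-letter country code: {country_code!r}")
    return {"id": name_id, "nac": code}


print(dades_nac_query(40683, "PL"))
```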

1.2.1.5. Parameter class

The class parameter lets you segment the results. The classification criteria depend on each subservice and are:

  • prov: by provinces
  • at: by areas of the Territorial Plan
  • com: by counties
  • nac: by citizenship (only available in the noms subservice)
  • t: in the case of noms, by decade of birth; in the case of nadons, by year of birth. The cognoms subservice does not support this value.

Ex. 10: Statistical information on women named MARIA (40683) in the last available year by counties
http://api.idescat.cat/onomastica/v1/noms/dades.xml?id=40683&class=com&lang=en
Ex. 11: Statistical information on newborn girls named MARIA (40683) by year of birth
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=40683&class=t&lang=en

The class parameter can be combined with filters.

Ex. 12: Statistical information on girls born in Baix Llobregat (com:11) and named MARIA (40683) by year of birth
http://api.idescat.cat/onomastica/v1/nadons/dades.xml?id=40683&class=t&geo=com:11&lang=en

In case of conflict between classification variables and filters (for example, class=t&t=2005), filters have priority and class will be ignored.
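
The precedence rule can be mirrored on the client to predict which parameters will take effect. The sketch below is an illustration, not the service's implementation. Only the class=t with t case is documented; treating class=nac with nac the same way is an assumption made by analogy.

```python
def effective_params(params):
    """Return a copy of params without class when a filter overrides it."""
    result = dict(params)
    cls = result.get("class")
    # Documented case: class=t together with the t filter.
    # Assumed by analogy: class=nac together with the nac filter.
    if cls in ("t", "nac") and cls in result:
        del result["class"]
    return result


print(effective_params({"id": 40683, "class": "t", "t": 2005}))
print(effective_params({"id": 40683, "class": "t", "geo": "com:11"}))
```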

1.2.2. Operation cerca

The cerca operation returns search results by name or surname fragment. If there are too many results, they are paged and you must use the posicio parameter to obtain the other pages of results.

1.2.2.1. Filter q

The q parameter filters the results by a name or surname fragment. If q is specified, the results are sorted by similarity and then by frequency. The similarity order is: first, the names or surnames that exactly match the search; then, the names or surnames of more than one word that include the search term as a word; and finally, the names or surnames that contain the search text as a fragment.

Ex. 13: Statistics of the first names that contain MARIA sorted by similarity and frequency
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?q=maria&lang=en

If neither a search string (q) nor sort criteria (orderby, desc) are specified, results are returned in ascending order of frequency. When q is not specified, names or surnames with a frequency lower than 4 are not included in the results (in the case of surnames, only those with a frequency equal to or higher than 4 as both first and second surname are included).

Ex. 14: Statistics on surnames from lowest to highest frequency
http://api.idescat.cat/onomastica/v1/cognoms/cerca.xml?lang=en

1.2.2.2. Filter sim

The results of the cerca operation are associated with a category of similarity to the searched string. The sim parameter lets you filter the results according to this similarity:

  • 0: returns only the names or surnames that exactly match the specified q string.
  • 1: returns only the names or surnames that contain the q string as a word (or expression) and are not included in the previous section.
  • 2: returns only the names or surnames that contain the q string as a fragment and are not included in the previous sections.

These values can be combined, separated by commas. By default, sim is equal to 0,1,2.

Ex. 15: Statistics of newborns named exactly MARIA
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?q=maria&sim=0&lang=en
Ex. 16: Statistics of newborns named exactly MARIA or that have a compound name with MARIA
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?q=maria&sim=0,1&lang=en

If q has not been specified, sim is ignored.
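
The three similarity categories can be illustrated with a local approximation in Python. This is not the service's matching algorithm: it assumes case-insensitive comparison and words separated by whitespace, and the sample names are illustrative inputs only.

```python
def similarity(name, q):
    """Return 0 (exact), 1 (word or expression), 2 (fragment) or None (no match)."""
    name_words = name.upper().split()
    q_words = q.upper().split()
    if not q_words:
        return None
    if name_words == q_words:
        return 0
    n = len(q_words)
    # Category 1: q appears as a whole word or as a run of consecutive words.
    if any(name_words[i:i + n] == q_words for i in range(len(name_words) - n + 1)):
        return 1
    # Category 2: q appears only as a fragment of the text.
    if " ".join(q_words) in " ".join(name_words):
        return 2
    return None


def filter_by_sim(names, q, sim=(0, 1, 2)):
    """Keep the names whose similarity category is listed in sim."""
    return [name for name in names if similarity(name, q) in sim]


names = ["MARIA", "ANNA MARIA", "MARIANA", "JOAN"]
print(filter_by_sim(names, "maria", sim=(0, 1)))
```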

1.2.2.3. Parameters orderby and desc

If q is specified, the results are sorted by similarity (sim) and frequency. When q is not specified, the orderby and desc parameters let you determine the sorting of the results.

  • orderby
    • v: The results are sorted by frequency of appearance of name or surname, and then alphabetically in ascending order.
    • nom: The results are sorted alphabetically by name or surname, and then from the highest to the lowest frequency.
    • sex: In the case of the noms and nadons subservices, the results are sorted by sex (men, women), and then from the highest to the lowest frequency.
  • desc
    • 0: Ascending order of the criterion established by orderby. It is the default value.
    • 1: Descending order of the criterion established by orderby.

We recommend always passing the orderby and desc parameters explicitly.

Ex. 17: Statistics of most frequent names of newborns in Catalonia
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?orderby=v&desc=1&lang=en
Ex. 18: Statistics of most common names for girls born in Catalonia
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?orderby=sex&desc=1&lang=en
Ex. 19: Statistics of surnames in alphabetical order
http://api.idescat.cat/onomastica/v1/cognoms/cerca.xml?orderby=nom&lang=en

If q has been specified, the orderby and desc parameters are ignored.
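
The interaction between q, sim, orderby and desc can be captured in a small query builder. This Python sketch uses a hypothetical helper: it omits the parameters that the service would ignore and always sends orderby and desc explicitly when there is no search string.

```python
ORDERBY_VALUES = ("v", "nom", "sex")


def cerca_query(q=None, sim=None, orderby=None, desc=None):
    """Build the specific parameters of a cerca request as a dictionary."""
    query = {}
    if q:
        query["q"] = q
        if sim is not None:
            query["sim"] = ",".join(str(s) for s in sim)
        # orderby and desc are ignored by the service when q is present.
        return query
    # Without q, sim is ignored. The documented default order is ascending frequency.
    orderby = orderby or "v"
    if orderby not in ORDERBY_VALUES:
        raise ValueError(f"unknown orderby value: {orderby!r}")
    query["orderby"] = orderby
    query["desc"] = 1 if desc else 0
    return query


print(cerca_query(orderby="v", desc=1))
print(cerca_query(q="maria", sim=[0, 1], orderby="nom"))
```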

1.2.2.4. Parameter posicio

As some searches can return a very large number of results, the response does not always include them all. The parameter posicio determines which is the first result included in the response. By default, posicio is equal to zero (results are returned starting with the first).

Ex. 20: Statistics of most common names of the population from position 26
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?orderby=v&desc=1&posicio=25&lang=en
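
Paging through a long result list amounts to repeating the request with increasing posicio values. The following sketch assumes that the caller already knows the total number of results and the number of results per response; this documentation does not state the page size, so the value 25 used here is only an example.

```python
from urllib.parse import urlencode

BASE_URI = "http://api.idescat.cat/onomastica/v1"


def page_urls(subservice, query, total_results, page_size, fmt="xml"):
    """Return one cerca URL per page, using zero-based posicio offsets."""
    urls = []
    for posicio in range(0, total_results, page_size):
        params = {**query, "posicio": posicio}
        urls.append(f"{BASE_URI}/{subservice}/cerca.{fmt}?" + urlencode(params, safe=":,"))
    return urls


query = {"orderby": "v", "desc": 1, "lang": "en"}
for url in page_urls("noms", query, total_results=60, page_size=25):
    print(url)
```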

1.2.2.5. Filter geo

The cerca operation accepts the geo parameter, which has the same meaning and restrictions as in the dades operation (see section 1.2.1.2).

Ex. 21: Latest statistics of most common names in the Alt Camp population (com:01)
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?orderby=v&desc=1&geo=com:01&lang=en
Ex. 22: Statistics of newborns in Alt Camp (com:01) named exactly MARIA
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?q=maria&sim=0&geo=com:01&lang=en
Ex. 23: Statistics of most common surnames in Alt Camp (com:01)
http://api.idescat.cat/onomastica/v1/cognoms/cerca.xml?orderby=v&desc=1&geo=com:01&lang=en
Ex. 24: Statistics of most common surnames in Alt Camp (com:01) including EZ as a fragment
http://api.idescat.cat/onomastica/v1/cognoms/cerca.xml?q=ez&sim=2&orderby=v&desc=1&geo=com:01&lang=en

1.2.2.6. Filter t

The cerca operation accepts the t parameter, which has the same meaning and restrictions as in the dades operation (see section 1.2.1.3).

Ex. 25: Statistics of most common names of the population born in the eighties
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?orderby=v&desc=1&t=1980-1989&lang=en
Ex. 26: Statistics of most common names of the population born in the eighties in Alt Camp (com:01)
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?orderby=v&desc=1&t=1980-1989&geo=com:01&lang=en
Ex. 27: Statistics of most common names in 2010 of newborns
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?orderby=v&desc=1&t=2010&lang=en
Ex. 28: Statistics of most common names in 2010 of newborns in Alt Camp (com:01)
http://api.idescat.cat/onomastica/v1/nadons/cerca.xml?orderby=v&desc=1&t=2010&geo=com:01&lang=en

1.2.2.7. Filter nac

The cerca operation accepts the nac parameter, which has the same meaning and restrictions as in the dades operation (see section 1.2.1.4).

Ex. 29: Statistics of most common names of the population with Polish citizenship
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?orderby=v&desc=1&nac=pl&lang=en
Ex. 30: Statistics of people with Polish citizenship exactly named ANNA
http://api.idescat.cat/onomastica/v1/noms/cerca.xml?q=anna&sim=0&nac=pl&lang=en

1.2.3. Operation sug

The sug operation returns an alphabetical list of names or surnames available in the database that begin with the characters specified in the required q parameter. The presence of a name or surname on the list does not mean that it appears in the latest available year.

Besides the general formats (XML, JSON, serialized PHP), this operation supports the TXT format.

1.2.3.1. Filter q

This parameter is mandatory and is used to specify the first few characters of a name or surname.

Ex. 31: Names of the population beginning with MART
http://api.idescat.cat/onomastica/v1/noms/sug.txt?q=mart&lang=en

To limit the size of the response, it is recommended to specify a q string of at least three characters.

1.3. Invocation without operation

For convenience, this API supports requests with a syntax that does not require specifying operations (see, in the general documentation of the Idescat APIs, section 1.4. Invocation without operation).

  • Statistics of name 40683 (MARIA, female) in the population of Catalonia in XML format
    http://api.idescat.cat/onomastica/v1/noms/40683.xml?lang=en
  • Statistics of name MARIA in the population of Catalonia in JSON format
    http://api.idescat.cat/onomastica/v1/noms/maria.json?lang=en
  • Statistics of surnames including MART in the population of Catalonia in JSON format
    http://api.idescat.cat/onomastica/v1/cognoms.json?q=mart&lang=en
  • Names of newborns beginning with MART in TXT format
    http://api.idescat.cat/onomastica/v1/nadons.txt?sug=mart&lang=en
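
The four shortened forms listed above follow a simple pattern that can be reproduced in Python. The helper short_url is hypothetical and only reconstructs the URLs of these examples; which operation each form maps to on the server is described in the general documentation.

```python
from urllib.parse import quote, urlencode

BASE_URI = "http://api.idescat.cat/onomastica/v1"


def short_url(subservice, fmt, name_or_id=None, params=None):
    """Build a request URL that does not name an operation."""
    if name_or_id is None:
        path = subservice
    else:
        # The identifier or literal becomes the last path segment.
        path = f"{subservice}/{quote(str(name_or_id))}"
    url = f"{BASE_URI}/{path}.{fmt}"
    if params:
        url += "?" + urlencode(params)
    return url


print(short_url("noms", "xml", 40683, {"lang": "en"}))
print(short_url("noms", "json", "maria", {"lang": "en"}))
print(short_url("cognoms", "json", params={"q": "mart", "lang": "en"}))
print(short_url("nadons", "txt", params={"sug": "mart", "lang": "en"}))
```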

2. Response

To find out the HTTP response codes returned and the formats supported by every service, see section 2 of the general documentation on Idescat APIs.

In general, when a value cannot be offered for reasons of statistical confidentiality, it is indicated by an underscore (_).

2.1. Operation dades

Each subservice has its own container (onomastica_noms, onomastica_cognoms, onomastica_nadons) and the following common general elements:

  • c: concept.
  • r: time reference.
  • geo: geographical reference.
  • s: source.
  • updated: date of update.
  • l: link to a related table at Idescat's website.

Ex. 32: Common general elements of the dades operation
<onomastica_noms version="1.00" lang="en" o="dades" n="47" p="id=40683;class=nac">
   <c sex="f">MARIA</c>
   <r>2012</r>
   <geo id="09" scheme="ca">Catalunya</geo>
   <s>Idescat, based on the Catalonia's Population Register.</s>
   <updated>2012-11-13T11:00:00+00:00</updated>
   <l>http://www.idescat.cat/noms/?geo=4&amp;id=40683&amp;lang=en</l>
   …
</onomastica_noms>

The results are included in an ff element with as many rows (f) as there are possible values. The c element describes each row.

Ex. 33: Statistics of names by citizenship
<onomastica_noms version="1.00" lang="en" o="dades" n="47" p="id=40683;class=nac">
   …
   <ff>
      <f>
         <c id="es" scheme="nac">Spain</c>
         …
      </f>
      <f>
         <c id="ro" scheme="nac">Romania</c>
         …
      </f>
      <f>
         <c id="ru" scheme="nac">Russia</c>
         …
      </f>
      …
      <f>
         <c>Total</c>
         …
      </f>
   </ff>
</onomastica_noms>

Ex. 34: Statistics of the name MARIA
<onomastica_noms version="1.00" lang="en" o="dades" n="2" p="id=MARIA">
   <c>MARIA</c>
   <r>2012</r>
   <geo id="09" scheme="ca">Catalunya</geo>
   <s>Idescat, based on the Catalonia's Population Register.</s>
   <updated>2012-11-13T11:00:00+00:00</updated>
   <ff>
      <f>
         <c id="40683" sex="f">MARIA</c>
         …
      </f>
      <f>
         <c id="39969" sex="m">MARIA</c>
         …
      </f>
   </ff>
</onomastica_noms>

Data are included in the following elements:

  • rank:
    • total: ranking position among all names regardless of sex.
    • sex: ranking position among all names of the same sex.
  • pos1:
    • v: frequency of occurrence of the name, or surname as a first surname.
    • w:
      • total: weight per thousand of total of names or first surnames.
      • sex: weight per thousand of total of names of the same sex.
  • pos2:
    • v: frequency of occurrence of surname as a second surname.
    • w:
      • total: weight per thousand of total of names or second surnames.

The cognoms subservice does not include any sex element. The noms and nadons subservices do not include the pos2 element.
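
A dades response in XML can be read with a standard XML parser. The following Python sketch rests on several assumptions: that rank, pos1, v and w are nested child elements as the list above suggests, and that numbers use a dot as decimal separator. The numbers in SAMPLE are invented placeholders for illustration, not real statistics.

```python
import xml.etree.ElementTree as ET

# Illustrative document modelled on Ex. 34; all numeric values are placeholders.
SAMPLE = """<onomastica_noms version="1.00" lang="en" o="dades" n="2" p="id=MARIA">
   <c>MARIA</c>
   <r>2012</r>
   <ff>
      <f>
         <c id="40683" sex="f">MARIA</c>
         <rank><total>1</total><sex>1</sex></rank>
         <pos1><v>100</v><w><total>10.5</total><sex>_</sex></w></pos1>
      </f>
   </ff>
</onomastica_noms>"""


def value(text):
    """Convert a data value; '_' marks a value withheld for confidentiality."""
    if text is None or text.strip() == "_":
        return None
    return float(text)


def parse_dades(xml_text):
    """Extract the concept, the time reference and the data rows."""
    root = ET.fromstring(xml_text)
    rows = []
    for f in root.iter("f"):
        c = f.find("c")
        rows.append({
            "label": c.text,
            "id": c.get("id"),
            "sex": c.get("sex"),
            "v": value(f.findtext("pos1/v")),
            "w_total": value(f.findtext("pos1/w/total")),
            "w_sex": value(f.findtext("pos1/w/sex")),
        })
    return {"concept": root.findtext("c"), "reference": root.findtext("r"), "rows": rows}


print(parse_dades(SAMPLE))
```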

2.2. Operation cerca

The cerca operation uses the Atom standard to describe the results, enhanced with elements of the OpenSearch standard.

The entries (entry) are extended with the f element as described in the dades operation.
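
Because the cerca response is an Atom feed with OpenSearch elements, it can be read with namespace-aware XML parsing. The sketch below uses the standard Atom and OpenSearch 1.1 namespaces. The SAMPLE feed is a simplified illustration: which OpenSearch elements the service actually returns, and the namespace of the f extension, are assumptions that should be checked against a real response.

```python
import xml.etree.ElementTree as ET

NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "os": "http://a9.com/-/spec/opensearch/1.1/",
}

# Simplified, illustrative feed; not a verbatim response of the service.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <opensearch:totalResults>2</opensearch:totalResults>
  <opensearch:startIndex>0</opensearch:startIndex>
  <opensearch:itemsPerPage>25</opensearch:itemsPerPage>
  <entry><title>MARIA</title></entry>
  <entry><title>ANNA MARIA</title></entry>
</feed>"""


def parse_cerca(xml_text):
    """Return the paging information and the entry titles of a cerca feed."""
    root = ET.fromstring(xml_text)
    return {
        "total": int(root.findtext("os:totalResults", namespaces=NS)),
        "start": int(root.findtext("os:startIndex", namespaces=NS)),
        "per_page": int(root.findtext("os:itemsPerPage", namespaces=NS)),
        "titles": [e.findtext("atom:title", namespaces=NS)
                   for e in root.findall("atom:entry", NS)],
    }


print(parse_cerca(SAMPLE))
```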

2.3. Operation sug

The sug operation uses the OpenSearch Suggestions standard to describe the results, which specifies a response in JSON. This same structure is used in the response when the serialized PHP format is requested.

The response in the XML format includes the same structured information according to the XML Search Suggestions Format (sometimes called OpenSearch SearchSuggest2).

For convenience, the response in text format is also available, with the results separated by line breaks.
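
The JSON and text responses of sug are simple enough to parse in a few lines. This Python sketch assumes the usual OpenSearch Suggestions layout, in which the first item is the query string and the second is the list of completions. The sample strings are illustrative, not captured responses.

```python
import json


def parse_sug_json(text):
    """Return (query, completions) from an OpenSearch Suggestions JSON array."""
    data = json.loads(text)
    return data[0], list(data[1])


def parse_sug_txt(text):
    """Return the suggestions of a text response, one per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


sample_json = '["mart", ["MARTA", "MARTI", "MARTINA"]]'
sample_txt = "MARTA\nMARTI\nMARTINA\n"

print(parse_sug_json(sample_json))
print(parse_sug_txt(sample_txt))
```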

2.4. Errors

The Idescat APIs use standardised response codes to show whether the request has been successful or has failed.
