This project needs a method for citing data that has been uploaded. Some of the reasons we need a citation format are:
- Recognition of a person's work.
- If a paper uses data, then the data should be referenced and the reader should be able to find it easily.
- Recognition of the collective work of the entire group.
Note that the first item has several aspects, such as personal satisfaction, status in the research community, obtaining future grants, and evaluation for tenure and promotion.
The requirements
The third item above, and some aspects of the other items too, make it clear that some "branding" is required. This project needs a name that (we hope) will eventually become known. Examples are arXiv, AIM, and Sage.
To satisfy the second item above, for the near future, we will need to supplement our citation format with a URL. I think it is important to adopt the mindset that this is just temporary until this project becomes widely known.
For most aspects of the first item, peer review is necessary to achieve proper recognition.
Versioning: the arXiv just numbers versions, but Mike wants the citation to make it clear how new the data is. A proposal is below.
Three cases
I suggest that it is helpful to consider 3 cases:
- The uploaded data was produced from or for a specific research paper.
- The person who uploaded the data also wrote a note on the data, but that note is not intended for publication in a journal.
- The uploaded data is not associated to any specific document.
In the first two cases, I think we agree that both the paper and the data should be cited, possibly in the format
Cite this work as: D. Farmer and S. Lemurell, Deformations of Maass forms, Math. Comp. 74 (2005), no. 252, 1967--1982. Data available at AAAAAAAAAA BBBBBBB:CCCCC
We need to make it clear how to attach author names to data that is not associated to a paper.
The problems to be solved, and some proposed solutions
I hope this is a complete list of the specific problems we have to address, in no particular order:
- Find a sexy name for the project (at most 3 words long, preferably with a nice acronym).
- Move this project towards maturity: have a visible advisory board, etc. Proposal: think about a peer review system, so that data can have a status similar to a journal article. The "Journal of Mathematical Data" with editors, etc?
- Reach consensus that the ultimate citation method has a format similar to: sexyname:CODENUMBERvVERSION
- Decision on the previous point: yyyymm.####vzzzzl, where yyyy is the original year, mm the original month, #### a sequential number, v a literal "v", zzzz the year of the current version, and l the version letter; e.g. 200806.0231v2009c was first submitted in June 2008, and the current version is the third version submitted in 2009.
- Decide on the (hopefully temporary) naming scheme for URL citations. Proposal 1: the URL does not point all the way down to the specific data, but it does point to something more specific than whatever.org. Proposal 2: we just append the URL to our ultimate citation format, so that it is easy to drop later.
- Develop a Technical Reports aspect of the project. Proposal: The Technical Reports will be peer reviewed.
- Decide how to attach "author" names to data that does not have an associated paper or report.
- Decide on a system (presumably similar to the arXiv style) for naming with a short string. Proposal: use a system almost identical to arXiv.
- Decide how to incorporate the version number into the name. Proposal: If X is the name, then the full name including the version would look like this, where versions are given by the year followed by a letter which is incremented for each new version in that year:
- Xv2006c
- Xv2007a
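
To make the proposed identifier scheme concrete, here is a rough sketch in Python of how a name of the form yyyymm.####vzzzzl could be parsed and how the next version could be computed. Nothing here is decided: the names DataId, parse_data_id and next_version are hypothetical, and I am assuming a four-digit sequential number, lowercase version letters, and that the letter restarts at "a" in each new year.

```python
import re
from dataclasses import dataclass

# Assumed shape of the proposed identifier: yyyymm.####vzzzzl
# (four-digit sequence number and lowercase letter are assumptions).
ID_PATTERN = re.compile(r"(\d{4})(0[1-9]|1[0-2])\.(\d{4})v(\d{4})([a-z])")


@dataclass(frozen=True)
class DataId:
    """Hypothetical container for one versioned data identifier."""
    year: int            # year of original submission
    month: int           # month of original submission
    number: int          # sequential number within that month
    version_year: int    # year of the current version
    version_letter: str  # a = first version in that year, b = second, ...

    def __str__(self) -> str:
        return (f"{self.year:04d}{self.month:02d}.{self.number:04d}"
                f"v{self.version_year:04d}{self.version_letter}")


def parse_data_id(text: str) -> DataId:
    """Split an identifier string into its parts, or raise ValueError."""
    m = ID_PATTERN.fullmatch(text)
    if m is None:
        raise ValueError(f"not a valid identifier: {text!r}")
    year, month, number, version_year, letter = m.groups()
    return DataId(int(year), int(month), int(number), int(version_year), letter)


def next_version(current: DataId, submission_year: int) -> DataId:
    """Identifier for a new version submitted in submission_year."""
    if submission_year < current.version_year:
        raise ValueError("new version cannot predate the current version")
    if submission_year > current.version_year:
        letter = "a"  # first version in a new year
    elif current.version_letter == "z":
        raise ValueError("more than 26 versions in one year")
    else:
        letter = chr(ord(current.version_letter) + 1)
    return DataId(current.year, current.month, current.number,
                  submission_year, letter)
```

With these definitions, parse_data_id("200806.0231v2009c") gives original year 2008, month 6, sequence number 231, and version letter "c" in 2009. A further version submitted in 2009 would be 200806.0231v2009d, and the first version submitted in 2010 would be 200806.0231v2010a. One thing the sketch makes visible is that the version suffix alone does not tell you how many versions exist in total, only how many were submitted in the latest year.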
