Team Giovanni/Patrizia

From MarineLives
Revision as of 06:34, August 31, 2012 by ColinGreenstreet (Talk | contribs)



Team Colin

Editorial history

23/08/12: CSG, created page






Suggested links


Team Colin
Team Jill
Team William

TEI: Text Encoding Initiative
TEI Lite



Tasks for the week



Week commencing 20th August 2012




Week commencing 30th August 2012


- Patrizia, how should we deal with quantities and currencies database-wise? <quantity value="hour">6. howers</quantity> OR <quantity value="hour">6</quantity>. howers? I suggested the latter.
Giovanni: I agree
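
A minimal sketch of why the second form may be easier to load into a database. It assumes the `<quantity>` tag exactly as written above; the helper name `parse_quantity` is hypothetical and not part of Scripto. With only the number inside the tag, the element text converts directly to an integer; with "6. howers" inside, the loader would first have to separate the number from the original spelling.

```python
import xml.etree.ElementTree as ET


def parse_quantity(fragment):
    """Return (amount, unit) from a <quantity> element; hypothetical helper."""
    element = ET.fromstring(fragment)
    unit = element.get("value")      # normalised unit, e.g. "hour"
    text = (element.text or "").strip()
    try:
        amount = int(text)           # works only when the tag holds the bare number
    except ValueError:
        amount = None                # e.g. "6. howers" needs extra cleaning first
    return amount, unit


# Second form: number inside the tag, original spelling left outside it.
print(parse_quantity('<quantity value="hour">6</quantity>'))
# First form: number and original spelling both inside the tag.
print(parse_quantity('<quantity value="hour">6. howers</quantity>'))
```

The first call yields a usable integer and unit; the second yields no amount until further cleaning is applied.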

- We'll need to decide which elements we want in the header, to mimic some of those in a TEI header. If you manage to give this some thought, it would be great.
I'm afraid I'm not familiar with the structure of the papers so far. Which elements do we want to include?

- One further concern: units of meaning in the text (depositions, cases, etc.). I think we need to identify them properly. Follow-up: we should probably have one header per document (i.e. picture), and a separate header for cases and depositions (each case contains many depositions, scattered across different pages). Ideally we'll have some metadata to associate the document with a picture, but we'll also represent the units of meaning of the source, which is paramount.
Patrizia: isn't this something that should mimic the structure of the national archives? (I mean the documentary unit)



Week commencing 3rd September 2012




Useful email records

Patrizia to Giovanni, Colin 30/08/12: 23:25


I tried to catch up on some issues, probably in an untidy way. I hope some of it is useful.
Giovanni, tell me if you read my comments in our area. I'm not clear how it works.
I'll try to make an example Excel file for tomorrow, using one of Colin's transcriptions.



Patrizia to Giovanni, Colin, Charlene, Stuart, Jill, William: 30/08/12: 23:22


PATRIZIA: Only some thoughts about some (not all) of the questions. As Colin said, I am travelling, and see pages only in a very limited and uncomfortable way. Sorry if some comment is unclear.



COLIN: For example:

The "ship" button: do we highlight "fortune", or "the fortune", or "the shipp the fortune"? Does it matter? Giovanni, the "ship" category is currently not displaying in colour in the HTML-driven publication of the transcribed page at the bottom (it displays as e.g. "the said shipp < style="color:blue">fortune").

PATRIZIA: Yes, it matters. If you include the article in the tag, everything will be sorted under 'the'. Think how names are painted on real ships: it's likely 'Fortune', not 'The Fortune'. It's the same problem I already highlighted in my last email about 'said' before the names of persons.
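
The sorting point can be sketched in a few lines. The lists below are illustrative tag contents only (the ship names appear elsewhere on this page), and `index_letters` is a hypothetical helper, not an existing Scripto function.

```python
# Illustrative tag contents only: the same three ships, tagged with and without the article.
with_article = ["the fortune", "the Royal Mary", "the golden Eagle"]
without_article = ["fortune", "Royal Mary", "golden Eagle"]


def index_letters(names):
    """Return the sorted first letters under which an alphabetical index would file the names."""
    return sorted({name[0].upper() for name in names})


print(index_letters(with_article))     # every ship is filed under the same letter
print(index_letters(without_article))  # ships are spread across their own initials
```

With the article inside the tag, an alphabetical index collapses into a single entry letter; without it, each ship is filed under its own name.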



COLIN: The "person" button: do we highlight only personal names, or do we include clearly identifiable individuals such as "the king of Spain", or should it be "King of Spain"?

PATRIZIA: Idem: 'King of Spain', and absolutely yes, include him (as it is). Then it is the database's job to give him a name, like here: "We saw the king and queen during Mass." It's from Mozart's letters.



If you hover with the mouse over 'king' and 'queen' you read 'Ferdinando IV di Borbone Napoli, born 12/01/1752, died 04/01/1825' and 'Maria Carlotta (Carolina) d'Asburgo-Lorena, born 13/08/1752, died 07/09/1814'.



COLIN: Another example would be the Lord Protector, which I have marked up as The Lord Protector (The <person>Lord Protector </person>against).

PATRIZIA: Sorry: be careful again with spaces within the tags.
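
A small sketch of why a stray space inside a tag matters. The sample string repeats Colin's markup and adds a second, invented occurrence without the trailing space; `person_keys` is a hypothetical helper. Without trimming, one person produces two different index keys.

```python
import re

PERSON_TAG = re.compile(r"<person>(.*?)</person>")


def person_keys(text, strip=False):
    """Return tagged person strings, optionally with surrounding spaces removed."""
    found = PERSON_TAG.findall(text)
    return [name.strip() for name in found] if strip else found


# First tag copies the markup quoted above; the second occurrence is invented for comparison.
marked_up = "The <person>Lord Protector </person>against ... the <person>Lord Protector</person>"

print(set(person_keys(marked_up)))              # two distinct keys for one person
print(set(person_keys(marked_up, strip=True)))  # a single key once spaces are stripped
```

Trimming at load time is one option; keeping spaces out of the tags in the first place avoids relying on it.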

COLIN: This is being used as a name, despite being a title (or is it an occupation?). Once we have the occupation button, would we mark this up in preference as an occupation, or is it both a name and an occupation, requiring double markup? If it requires double markup, is there a syntax which requires one to come within the other? (I don't see any brackets or other syntax-like devices being generated in the HTML code.)

PATRIZIA: No, I would say that this is only a way to refer to the person. It is not the same case as:



COLIN: <person>Charles Anquestil</person>, <profession>Mariner</profession> and <profession>Gunner</profession>

PATRIZIA: In this case 'mariner' is a predicate of the person, and indicates his profession (i.e.: I find your mark correct), with the already mentioned warning that I would mark 'gunner' with a tag 'role' or 'title', as you find best (after all I find that 'role' would better suit).
If I may take again an example from Mozart's letter, see this:

http://letters.mozartways.com/index.php?lang=eng&theme=people&name=1200&alpha=C

You see that Antonio Colonna Branciforte is mentioned in letter 171. If you click on 'View', you will see the term 'Cardinal' highlighted within the text of letter 171.

'Cardinal' is a way to refer to the 'person' Antonio Colonna Branciforte, who had, over time, different roles:

1. Assistente al Soglio Pontificio (27/02/1754)
2. Nunzio Apostolico a Venezia (02/04/1754)
3. Cardinale (06/04/1766)
4. Legato Pontificio a Bologna (1769 — 1775)



COLIN: Would "King of Spain" be marked up both as a person and "Spain" as a place, or is this a clear case of where the transcriber is distinguishing person and place?

PATRIZIA: No, again: King of Spain is only a way to refer to that person. Semantically, it does not refer to a place.



COLIN: In the case of "place" I have assumed (i.e. made an editorial policy assumption) that compound places will be marked up twice, e.g. "of Callice in ffrance" is marked up as "of <place>Callice</place> in <place>ffrance</place>".

PATRIZIA: See my previous email. The document mentions one place, not two. France is again an attribute of Callice. We can tag the countries, if we find it useful. (This is not needed to find out where Callice is, because this is done with the database, but it could be useful in all the cases where ONLY a country is mentioned. Giovanni, what do you think?)



COLIN: If we wanted to use markup (converted into TEI-compliant markup) to drive searches such as "How many legal depositions refer to French war ships (as opposed to merchant ships) in the (English) Channel?", you would need to know that "the golden Eagle of Callice" and "the Royal Mary" were French ships and were ships of war, and that Callice (presumably Calais) and other ports such as Dinkirk (with its spelling variants) had been grouped for the purpose of the search under the broader term (English) "Channel".

PATRIZIA: I agree that these will likely be normal searches that people will perform on such a website, but this is the task of the database, in my opinion. The lack of a relational structure (as opposed to a textual search) is exactly what prevents you from answering these questions correctly. I sent Giovanni a paper yesterday that highlights the problem very well. Consider the wonderful Old Bailey Proceedings. Despite being a fantastic resource, there is no way to tell whether two persons with the same name are two persons or a double reference to the same person. This is why for Mozart we use a relational database, and why I think that we should do the same for MarineLives.
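
A rough sketch of how a relational structure could answer Colin's example search, using Python's built-in `sqlite3`. Everything here is an assumption for illustration: the table and column names, the idea of linking a ship to a home port, the deposition labels, and the "placeholder merchant ship" row are all invented; only the names "golden Eagle", "Callice", "Dinkirk" and the "Channel" grouping come from the discussion above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE place (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE ship (id INTEGER PRIMARY KEY, name TEXT, nationality TEXT,
                   kind TEXT, home_port_id INTEGER REFERENCES place(id));
CREATE TABLE mention (deposition TEXT, ship_id INTEGER REFERENCES ship(id));
""")

# Placeholder rows for illustration only; deposition labels are invented.
conn.executemany("INSERT INTO place VALUES (?, ?, ?)",
                 [(1, "Callice", "Channel"), (2, "Dinkirk", "Channel")])
conn.executemany("INSERT INTO ship VALUES (?, ?, ?, ?, ?)",
                 [(1, "golden Eagle", "French", "war", 1),
                  (2, "placeholder merchant ship", "French", "merchant", 2)])
conn.executemany("INSERT INTO mention VALUES (?, ?)",
                 [("dep-1", 1), ("dep-2", 1), ("dep-2", 2), ("dep-3", 2)])

# Count depositions that mention a French ship of war whose home port is grouped under "Channel".
QUERY = """
SELECT COUNT(DISTINCT mention.deposition)
FROM mention
JOIN ship ON ship.id = mention.ship_id
JOIN place ON place.id = ship.home_port_id
WHERE ship.nationality = 'French' AND ship.kind = 'war'
  AND place.region = 'Channel'
"""
french_warship_depositions = conn.execute(QUERY).fetchone()[0]
print(french_warship_depositions)
```

The same idea would apply to persons: giving each individual a row with its own id lets two mentions of the same name point either to one record or to two, which a purely textual search cannot express.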



COLIN: I am also clear (for discussion) that we need the transcribers first to create a "clean" transcription, without using any category buttons, and that this then needs review and perfection and signoff, before the categories are added. Otherwise palaeographical questions and learning will get all mixed up with category editorial policy, and I think that is a very big ask for the first four weeks of transcription post training. So I think team facilitators, to the extent that they are acting as page editors, will need to take two passes at each page. For discussion please.

PATRIZIA: Colin: this is probably very, very wise. In Italy we say that Rome was not made in a day. It's already so troublesome to get through the palaeographical issues that you would probably do better to concentrate on them. We can easily go back to these discussions after 3/4 weeks, no?


COLIN: I also understand Giovanni's (and Patrizia's) points about the weakness of Scripto being the absence of a robust, cumulative, aggregating database which accumulates the category markup, so that at any one time you can inspect all places, people, etc. input by transcribers to date. However, that is clearly not soluble for this project. It does mean that any "nice to have" but NOT essential functionality, such as mapping out places referred to in the transcribed documents, would have to be done as a one-off piece of analysis, almost certainly on a sample basis. Playing with mapped data looks like it can only really happen after we have that robust database, which will be generated by the markup/analysis team.

PATRIZIA: Totally agree.



COLIN: Finally, it is clear to me that we should move any planned end-of-project conference from the tentative end of January to a tentative end of March or early April, to give us LOTS of wiggle time on the database/markup/analysis stage of the project.

These are my thoughts for what they are worth. I am very keen to hear back as soon as possible from Stuart and Charlene, and also to get William and Jill's input.

I am also going to contact Dr Elaine Murphy today, who I am meeting in Cambridge on Thursday 6th September, to ask her if she would be prepared to take a look at our Scripto modifications and to comment.

Patrizia is leaving today for Austria, and will only be back properly into the conversation Friday week. She and Giovanni are clearly already establishing a very productive relationship, and will jointly be leading the database/semantic markup/analysis team, and will be joint facilitators of that team. I am looking to the two of them, once Patrizia is back, to produce an outline plan for their team, showing broad milestones, and very importantly providing an estimate of the numbers of associates they need on their team, with what sorts of prior experience and training.



Queries to team leaders

Colin