To select or unselect a group of files belonging to the same folder, click on the checkbox to the right of the folder's name. For users who have the authorization, the ELAN icon in front of each filename downloads the corresponding ELAN file (right-click on it, then choose 'save the target as...').
One can access the list of occurrences of an item by clicking on its value.
From that page, it's possible to
The Concordance button creates a list of the words matching the regular expression given in the (Part of) word concordances box, together with their left and right contexts. The matching words are centered on the page.
The prosodic units in which these words appear can be displayed (and then listened to) one by one by clicking on their identifier. (Here, the concordance of the word 'o:mhi:n', then the display of the prosodic unit BEJ_MV_NARR_02_farmer_313.) The unit display can then be extended on both sides by entering the number of units wanted on the left and on the right, then clicking the Extend display button.
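To make the idea of a concordance concrete, here is a minimal Python sketch of a keyword-in-context listing. It is not the site's implementation: the function names `concordance` and `show`, the `width` parameter, and the toy word list are all assumptions made for illustration.

```python
import re

def concordance(words, pattern, width=3):
    """Return (left context, matching word, right context) for each match."""
    rx = re.compile(pattern)
    rows = []
    for i, word in enumerate(words):
        if rx.search(word):
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            rows.append((left, word, right))
    return rows

def show(rows):
    """Print the rows with the matching word aligned in a central column."""
    pad = max((len(left) for left, _, _ in rows), default=0)
    for left, word, right in rows:
        print(f"{left.rjust(pad)}  [{word}]  {right}")

# Hypothetical toy data, not taken from the corpus.
words = "the farmer met the old farmer near the farm".split()
rows = concordance(words, r"^farm", width=2)
show(rows)
```

Right-aligning the left context is what places every matching word in the same column, which is the "centered" layout described above.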
The query engine used for the CorpAfroAs corpus is based on the mfSearch package from the Max Planck Institute for Psycholinguistics, Nijmegen. The query interface reproduces that of the ELAN software. Two complementary functions have been added. Here is a description of the form:
The Search window presents two areas:
By default
-> case sensitive: uppercase and lowercase are not equivalent
-> regular expression: how the search targets and contexts must be interpreted (cf. bottom of the page)
-> minimal duration: search only in units at least this long (0 = any duration)
-> maximal duration: search only in units at most this long (0 = any duration)
-> Target: the sequence searched for. Do not forget to specify, to the right of the layer, the tier (or tier type) in which to search for this sequence (morpheme, word, gloss or category tiers...). (In the screenshot above, the label 'OBL' is searched for in tiers of type 'ge'.)
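The way these options combine can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Unit` class, the `find_units` function and its parameter names are hypothetical, the time unit of `duration` is not specified by this page, and the real engine may behave differently in detail.

```python
import re
from dataclasses import dataclass

@dataclass
class Unit:
    text: str        # annotation content
    duration: float  # unit length (time unit assumed, not given by the page)

def find_units(units, target, case_sensitive=True, regex=False,
               min_duration=0, max_duration=0):
    """Return the units containing target, honouring the form options."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # Without the 'regular expression' option, the target is taken literally.
    pattern = target if regex else re.escape(target)
    rx = re.compile(pattern, flags)
    hits = []
    for unit in units:
        # A value of 0 means 'any duration', as in the form.
        if min_duration and unit.duration < min_duration:
            continue
        if max_duration and unit.duration > max_duration:
            continue
        if rx.search(unit.text):
            hits.append(unit)
    return hits

# Hypothetical data for illustration only.
units = [Unit("OBL", 1.2), Unit("obl", 0.4), Unit("N.OBL", 3.0)]
default_hits = [u.text for u in find_units(units, "OBL")]
any_case_hits = [u.text for u in find_units(units, "OBL", case_sensitive=False)]
long_hits = [u.text for u in find_units(units, "OBL", min_duration=1)]
```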
Multiple layer search
You can refine your initial search by adding constraints in the layer below. In this case, you will have to choose the type of constraint you want to impose on the targets.
For example, after having found 1566 morphemes tagged as 'demonstrative' in the corpus ('DEM' in 'ge' type tiers), one would like to know how many are of 'proximal' type ('PROX' in 'rx' type tiers).
For example, to find the nouns ('N' in 'rx' type tiers) directly followed by a determiner ('DET' in 'rx' type tiers), one will search:
= 0 would mean 'in the same annotation'.
< 2 between Left Context and Target would mean 'with an annotation containing the Left Context sequence zero or one annotation to the left of the target annotation'.
-> The Clear button clears the form content
-> The Find button launches the search
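The distance constraints can be read as a test on the number of annotations separating the target from its context. The following Python sketch illustrates that reading. It is an assumption-laden simplification, not the engine's code: the function `constrained_search` and its parameters are hypothetical, and target and context are searched on a single tier, whereas the form lets you pick a different tier for each layer.

```python
import operator
import re

# Comparison operators as they appear in the form.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def constrained_search(annotations, target, context, op, n, side="left"):
    """Return indices of annotations matching target that have an annotation
    matching context at a distance d (counted in annotations) with d op n."""
    compare = OPS[op]
    step = -1 if side == "left" else 1
    hits = []
    for i, annotation in enumerate(annotations):
        if not re.search(target, annotation):
            continue
        d, j = 0, i
        while 0 <= j < len(annotations):
            if compare(d, n) and re.search(context, annotations[j]):
                hits.append(i)
                break
            d += 1
            j += step
    return hits

# Hypothetical tier content for illustration only.
tier = ["N", "DET", "V", "N", "ADJ", "DET"]
nouns_before_det = constrained_search(tier, r"^N$", r"^DET$", "=", 1, side="right")
det_after_noun = constrained_search(tier, r"^DET$", r"^N$", "<", 2, side="left")
same_annotation = constrained_search(["DEM.PROX", "DEM.DIST"], "DEM", "PROX", "=", 0)
```

With `= 1` on the right side, only the first noun is returned, because it is the only one directly followed by a determiner.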
Regular expressions
Regular expressions provide flexible means of matching strings of text, such as 'beginning with', 'ending with', 'any from a list'... By default, the sequences entered in the target or context boxes are searched for anywhere inside the annotations. For example, searching for the label 'PFV' will therefore also retrieve the annotations labelled 'IPFV', 'IPFV.3SG.F'...
\b | marks a word boundary (beginning, end, punctuation). e.g.: \bIPF\b = only 'IPF' as a whole label within an annotation (complex or not) |
^ | beginning of the text. e.g.: ^N = all annotations beginning with N; ^- = all annotations that are suffixes (in this corpus, prefixes carry a hyphen on the right, suffixes a hyphen on the left) |
$ | end of the text. e.g.: -$ = all annotations that are prefixes |
. | any single character (if nothing follows it, it is interpreted as 'any sequence of characters') |
\. or [.] | the character '.' |
? | the previous character is optional (present once or absent). e.g.: 'gr?ave' will match 'gave' and 'grave' |
[?] | the character '?' |
+ | the previous character one or more times. e.g.: 'me+t' will match 'met', 'meet'... |
[aeiou] | one of these vowels. e.g.: 'p[aeiou]pe' will match words such as 'pape', 'pipe', 'pepe'... |
[^ptk] | any character but 'p', 't' or 'k' |
[a-h] | any letter between 'a' and 'h' |
NOT() | annotation not containing the text between the parentheses. e.g.: NOT(\.) in rx or ge = the plain annotations (not complex, i.e. without '.') |
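Most of the patterns in this table can be tried out with Python's `re` module, as in the sketch below. Two caveats apply: NOT() is specific to this query form and is only emulated here by negating a match, and the special reading of a trailing '.' described above is not reproduced. The helper names `matching` and `not_matching` and the label list are hypothetical.

```python
import re

# Hypothetical labels for illustration only.
labels = ["PFV", "IPFV", "IPFV.3SG.F", "N", "N.PL", "-PL", "CAUS-"]

def matching(pattern, items=labels):
    """Items in which the pattern is found anywhere (the default behaviour)."""
    return [s for s in items if re.search(pattern, s)]

def not_matching(pattern, items=labels):
    """Emulation of NOT(pattern): items in which the pattern is not found."""
    return [s for s in items if not re.search(pattern, s)]

anywhere = matching(r"PFV")         # also retrieves the IPFV labels
whole = matching(r"\bPFV\b")        # PFV only as a whole label
starts_with_n = matching(r"^N")     # annotations beginning with N
suffixes = matching(r"^-")          # hyphen on the left
prefixes = matching(r"-$")          # hyphen on the right
plain = not_matching(r"\.")         # annotations without '.'
optional_r = matching(r"gr?ave", ["gave", "grave", "groove"])
repeated_e = matching(r"me+t", ["mt", "met", "meet"])
one_vowel = matching(r"p[aeiou]pe", ["pape", "pipe", "pope", "pype"])
```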