File README.DIC
    To accompany the GNU version of the set of files (cide.*) containing
    the electronic version of the
    Collaborative International Dictionary of English.
    (called also GCIDE)
    These files contain Version 0.46 (January 2002)
 * * * * * * * * * * * * * * * * * * * * * * * * * * * *

The dictionary was derived from the
    Webster's Revised Unabridged Dictionary
    Version published 1913
    by the C. & G. Merriam Co.
    Springfield, Mass.
    Under the direction of
    Noah Porter, D.D., LL.D.

and has been supplemented with some of the definitions from
    WordNet, a semantic network created by
    the Cognitive Science Department
    of Princeton University
    under the direction of
    Prof. George Miller

and is being proof-read and supplemented by volunteers from
around the world.  This is an unfunded project, and future
enhancement of this dictionary will depend on the efforts of
volunteers willing to help build this free resource into a
comprehensive body of general information.  New definitions
for missing words or word senses and longer explanatory notes,
as well as images to accompany the articles, are needed.  More
modern illustrative quotations giving recent examples of
usage of the words in their various senses will be very
helpful, since most quotations in the original 1913 dictionary
are now well over 100 years old.

    This electronic version is being maintained by World Soul,
a non-profit organization in Plainfield, NJ.  For additional
information, or if you are willing to assist in the construction of
this data source, contact:

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Patrick J. Cassidy          |  TEL: (908) 561-3416
 World Soul                  |  if no answer, (908) 668-5252
 735 Belvidere Ave.          |  FAX: (908) 668-5904
 Plainfield, NJ 07062-2054
 pc@worldsoul.org   or   cassidy@micra.com
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

 * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

GCIDE is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version.

GCIDE is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this copy of GCIDE; see the file COPYING.  If not, write
to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
Boston, MA 02111-1307, USA.
 * * * * * * * * * * * * * * * * * * * * *

STRUCTURE OF THE DICTIONARY
---------------------------
    When the archives are unpacked, the main dictionary text of
the GCIDE will be found in 26 files named "cide.*", where the
asterisk indicates which letter of the alphabet begins the
words in each file.  For example, file "cide.b" contains words
beginning with the letter "B".  Additional information about the
tagging conventions and special character symbols is contained in
ancillary files in this directory (more information below).  The main
body of the 1913 dictionary was essentially identical to the edition
published in 1890, and was republished in 1913 with an appendix
containing "New Words".  The new words of that appendix have been
integrated into the main file in this version.  However, it is important
to keep in mind that the definitions in this dictionary are in most
cases over 100 years old.  Use them with caution!
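
The file naming scheme described above can be sketched in a few lines
of Python.  This is only an illustration: the function names are
hypothetical, and it assumes the files carry lowercase suffixes, as in
the "cide.b" example.

```python
import string


def gcide_file_names(prefix="cide."):
    """Return the 26 expected file names, cide.a through cide.z."""
    return [prefix + letter for letter in string.ascii_lowercase]


def file_for_word(word, prefix="cide."):
    """Guess which file holds a word, based on its first letter.

    Assumption: words are filed under their first ASCII letter; words
    that start with any other character are rejected here.
    """
    first = word.strip()[:1].lower()
    if first not in string.ascii_lowercase or first == "":
        raise ValueError("word does not start with a letter a-z: %r" % word)
    return prefix + first


if __name__ == "__main__":
    print(gcide_file_names()[:3])
    print(file_for_word("Book"))
```
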
    At the bottom of each paragraph in this dictionary, a bracketed
and tagged "source" is given.  This tells where the definition or
other text in that paragraph came from, as follows:

[<source>1913 Webster</source>]
      = From the original 1890 dictionary.
[<source>Webster 1913 Suppl.</source>]
      = From the 1913 "New Words" supplement to the Webster.
[<source>WordNet 1.5</source>]
      = From the WordNet on-line semantic network.
[<source>Century Dict. 1906.</source>]
      = From the Century Dictionary published in 1906, especially from
        the "Proper Names" supplement (volume IX).
[<source>XXX</source>]
      = Added by one of the volunteers.

    The original definitions have been tagged and in some cases
reformatted or slightly rearranged.  If substantive information
is added from a second source, usually the additional source is
also noted, as in:
[<source>Webster 1913 Suppl.</source> + <source>WordNet 1.5</source>]

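A program can read these source notes with a pair of regular
expressions.  The following Python sketch is an illustration only; the
function name and the sample paragraph are made up, and it assumes the
note is the last bracketed group of <source> tags in the paragraph.

```python
import re

# One bracketed note: one <source> element, optionally followed by
# further "+ <source>...</source>" elements.
BRACKET_RE = re.compile(
    r"\[(<source>.*?</source>(?:\s*\+\s*<source>.*?</source>)*)\]"
)
SOURCE_RE = re.compile(r"<source>(.*?)</source>")


def paragraph_sources(paragraph):
    """Return the source names from the last bracketed note, if any."""
    notes = BRACKET_RE.findall(paragraph)
    if not notes:
        return []
    return SOURCE_RE.findall(notes[-1])


if __name__ == "__main__":
    sample = (
        "<p><def>An invented definition.</def><br/\n"
        "[<source>Webster 1913 Suppl.</source> + "
        "<source>WordNet 1.5</source>]</p>"
    )
    print(paragraph_sources(sample))
```
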
    A list of the ancillary files related to the GCIDE is appended at
the bottom of this "readme.dic" file.
    This version is tagged with SGML-like tags of the form <pos>...</pos>
so that the original typography (italics, bold, block quotes) can be
reproduced.  A list of the most important tags for fields in the
dictionary is given below.  The tags also serve the more important
function of allowing the information content to be conveniently imported
into computer programs or databases.  The set of tags used is described
in the accompanying file "tagset.web".  ***NOTE*** the paragraph tags
<p>...</p> do *not* always nest properly with certain other tags, such
as <note> and <cs> ("collocation section"), which in some cases span
multiple paragraphs.  If you are using a tag parser which detects
improper nesting, you should first either delete the paragraph
tags or convert them to non-tag symbols, or, if possible, set the
parser to ignore the <p>...</p> tags.
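
The two workarounds mentioned in the note (deleting the paragraph tags,
or converting them to non-tag symbols) might look like this in Python.
This is a sketch, not part of the GCIDE distribution: the function
names and the "[[p]]" / "[[/p]]" placeholder strings are arbitrary
choices made for this example.

```python
import re

P_TAG_RE = re.compile(r"</?p>")

# Placeholder symbols are an assumption; any strings that cannot occur
# in the dictionary text would do.
P_OPEN, P_CLOSE = "[[p]]", "[[/p]]"


def drop_paragraph_tags(text):
    """Delete every <p> and </p> tag."""
    return P_TAG_RE.sub("", text)


def mask_paragraph_tags(text):
    """Convert <p> and </p> to non-tag placeholder symbols."""
    return text.replace("<p>", P_OPEN).replace("</p>", P_CLOSE)


def unmask_paragraph_tags(text):
    """Restore the paragraph tags hidden by mask_paragraph_tags()."""
    return text.replace(P_OPEN, "<p>").replace(P_CLOSE, "</p>")


if __name__ == "__main__":
    # A <note> that spans two paragraphs, so <p> and <note> do not nest.
    badly_nested = "<p><note>first part</p><p>second part</note></p>"
    print(drop_paragraph_tags(badly_nested))
    print(mask_paragraph_tags(badly_nested))
```

Masking rather than deleting keeps the paragraph boundaries
recoverable after the strict parser has done its work.
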
    The unusual characters (such as Greek or the European accented
characters, as well as special characters used in the pronunciations)
are described in the accompanying file "webfont.asc".  Some information
on the pronunciation system used may be found by viewing the files
"wxxvii.jpg" and "pronunc.jpg" with an image viewer (or any web browser),
and additional explanations of pronunciation are in the file
"pronunc.web".
    Each paragraph of the original text is enclosed within tags of
the form <p> . . . </p>.  There are no line breaks within these
paragraphs, and some of the paragraphs are over 12,000 characters long.
These lines are too long to be handled by the vi editor, and probably
by some other text editors.  At some points, embedded line breaks within
a "paragraph" are marked by a <br/ "entity".  The file can therefore
be converted, if necessary, to a form with shorter lines, and
subsequently converted back to the form having one line per paragraph.

    If additional line breaks are added, then in order to remove the
line breaks and reconstruct the original paragraphs, so that the
page width can be adjusted, perform the following manipulations:
   (1) convert each line break (cr-lf combination) to a space.
   (2) convert the string "</p>  " (</p> followed by two spaces)
       to </p> followed by two line breaks (cr-lf combinations).
   (3) convert the string "<br/ " (<br/ followed by one space)
       to <br/ followed by one line break (cr-lf).
There will be some "lines" (paragraphs) with over 12,000 characters,
which may give trouble to some simple text editors.
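
The three manipulations above translate directly into string
replacements.  The Python sketch below is one possible rendering; the
function name is hypothetical, and it assumes the text uses cr-lf line
breaks throughout, as the steps state.

```python
CRLF = "\r\n"


def rejoin_paragraphs(text):
    """Undo added line breaks, leaving one line per paragraph."""
    # (1) every line break becomes a single space
    text = text.replace(CRLF, " ")
    # (2) "</p>" followed by two spaces marks a paragraph boundary
    text = text.replace("</p>  ", "</p>" + CRLF + CRLF)
    # (3) "<br/" followed by one space marks an embedded line break
    text = text.replace("<br/ ", "<br/" + CRLF)
    return text


if __name__ == "__main__":
    wrapped = (
        "<p>one" + CRLF + "two<br/" + CRLF + "three</p>" + CRLF + CRLF
        + "<p>four</p>" + CRLF + CRLF
    )
    print(repr(rejoin_paragraphs(wrapped)))
```

When reading a file for this purpose in Python, open it with
newline="" (or in binary mode) so that the cr-lf pairs are not
translated before the replacements run.
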
    A more sophisticated formatting of spaces within paragraphs may
require the use of the fully-tagged master files.  If you have
a need for these files, contact Patrick Cassidy: cassidy@micra.com.
    The approximate beginning of each page is marked by an SGML
comment of the form <-- p. 345 -->.  (The exact beginning was in some
cases in the middle of a paragraph, which we decided was not a
good location for these page-number comments, so the page number
was usually moved to the next paragraph break).  Pages which have
been proofread by volunteers (e.g., with initials VOL) will have a
note within that page comment: <-- p. 345 pr=VOL -->.  Pages which have
not been proofread yet (most of them) will have varying numbers of
typographical errors in them.  We still (January 2002) need
proofreaders to get the errors out of these dictionary files.

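The page comments make it easy to list which pages have been proofread.
The Python sketch below is an illustration only: the function name is
made up, and the pattern assumes the comments look exactly like the two
examples given above (a page number, optionally followed by
"pr=" and the proofreader's initials).

```python
import re

# Matches <-- p. 345 --> and <-- p. 345 pr=VOL -->
PAGE_RE = re.compile(r"<--\s*p\.\s*(\d+)(?:\s+pr=(\w+))?\s*-->")


def page_markers(text):
    """Return (page_number, proofreader_or_None) for each page comment."""
    return [(int(m.group(1)), m.group(2)) for m in PAGE_RE.finditer(text)]


def unproofread_pages(text):
    """Return the page numbers whose comment carries no pr= note."""
    return [page for page, reader in page_markers(text) if reader is None]


if __name__ == "__main__":
    sample = "<-- p. 345 --> some text <-- p. 346 pr=VOL --> more text"
    print(page_markers(sample))
    print(unproofread_pages(sample))
```
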
***********************************************************************
**                            WARNING!!!                             **
***********************************************************************

    This version is only a first typing, and has numerous typographic
errors, including errors in the field-marks.  In addition, the user must
keep in mind that this text is very old and will contain numerous
obsolete, inaccurate, and perhaps offensive statements, which are
included solely because this work is intended to reproduce accurately
this historically interesting classic reference work.  This text should
not be relied upon as an accurate source of information, as in many
cases it represents the state of knowledge around 1890.  The text is
provided "as is", and the user must accept responsibility for all
consequences of its use.  Please refer to the header of each file and
the GNU General Public License.  If these conditions of use are
unacceptable, please do not use these texts.
***********************************************************************
***********************************************************************
    This electronic dictionary is also made available as a potential
starting point for development of a modern comprehensive encyclopedic
dictionary, to be accessible freely on the internet, and developed by the
efforts of all individuals willing to help build a large and freely
available knowledge base.  A large number of collaborators are needed to
bring this dictionary to a more accurate, more modern, and more useful
state.  Anyone willing to assist in any way in constructing such a
knowledge base should contact Patrick Cassidy (see above).  All reports
of errors will be gratefully received, and should also be transmitted to
PC at: pc@worldsoul.org.

In addition to the main text of the dictionary, additional
explanatory material about this version of the dictionary is available
in the ancillary files:

=====================================================================
COPYING               18,321  11-03-99   1:13a  COPYING
README   DIC          13,775  01-17-02  11:48p  readme.dic
WEBFONT  ASC          35,234  12-12-01   3:27p  WEBFONT.ASC
TAGSET   WEB          55,843  08-16-01   1:16p  TAGSET.WEB
PRONUNC  WEB          14,312  06-18-00   3:02p  PRONUNC.WEB
PRONUNC  JPG       2,569,796  06-18-00   3:11p  PRONUNC.JPG
SYMBOLS  JPG         144,716  06-18-00   3:13p  SYMBOLS.JPG
WXXVII   JPG       1,188,380  06-18-00   3:19p  WXXVII.JPG
=====================================================================


Most important tags used in the GCIDE:
<hw>       tags the headword
<pr>       pronunciation
<pos>      part of speech
<ety>      etymology
<ets>      "source" word within an <ety> field, usually foreign words
<fld>      field of knowledge (e.g. Med. = medicine)
<def>      definition
<cs>       collocation section (containing word combinations)
<col>      collocation entry (word combination)
<cd>       collocation definition
<as>       illustrations of usage (within a <def>. . . </def> field)
<au>       authority for a definition, or author of a quotation
<q>        illustrative quotation -- in block quote format
<au>       author of an illustrative <q> quotation
<altname>  alternative name for the headword -- essentially a synonym
<asp>      alternative spelling of the headword
<syn>      list of synonyms for the headword
<p>        paragraph
<b>        bold type
<it>       italic type

For other tags, see the file "tagset.web"


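As a rough illustration of importing tagged fields into a program, the
Python sketch below pulls the contents of a named tag out of one
paragraph.  The function name and the sample entry are invented for
this example and are not taken from the dictionary files; nested tags
inside a field are left in place, and real entries may need more
careful handling (see "tagset.web").

```python
import re


def field_values(paragraph, tag):
    """Return the text of every <tag>...</tag> field in a paragraph."""
    pattern = re.compile(
        r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL
    )
    return pattern.findall(paragraph)


if __name__ == "__main__":
    # Invented sample entry using the tags listed above.
    entry = (
        "<p><hw>Sample</hw> <pos>n.</pos> "
        "<def>An invented entry, used here <as>as an illustration</as>."
        "</def></p>"
    )
    print(field_values(entry, "hw"))
    print(field_values(entry, "pos"))
    print(field_values(entry, "def"))
```
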
=============================================================
              OTHER VERSIONS OF THE DICTIONARY
=============================================================

    There are several other derivative versions of this dictionary
on the internet, in some cases reformatted or provided with an
interface.  Those that I am aware of are:

(1) Project Gutenberg
---------------------
    In the etext96 directory of Project Gutenberg (www.prairienet.org)
there is a version of the original 1913 dictionary, which is in
the **public domain**.  The main files are in the directory etext96,
and are labeled pgw050**.***.  The tags for that version are a subset
of those used in this GNU version.

(2) The DICT development group
------------------------------
This group has created a program to index and search this dictionary.
The program can be downloaded and used locally, but at present
is available only in a Unix-compatible executable version.
See their web site at http://www.dict.org.

(3) The University of Chicago ARTFL project
-------------------------------------------
Mark Olsen and Gavin LaRowe at the University of Chicago have
converted the original 1913 dictionary to HTML and have provided an
interface allowing search of the headwords.  When the supplemented
version has developed sufficiently to warrant the effort, a
similar searchable version may be posted there as well.  The
search page is at:
    http://humanities.uchicago.edu/forms_unrest/webster.form.html

That page will provide links to other ARTFL projects and contact
information for the ARTFL group, who alone can provide information
about the HTML version or interface.


    -- PJC