94433f7f0f87968dc979d54f02ffb1e3 dict-gcide-README.DIC

author ankry <ankry@pld-linux.org>

Tue, 7 Jan 2003 10:27:24 +0000 (10:27 +0000)

committer cvs2git <feedback@pld-linux.org>

Sun, 24 Jun 2012 12:13:13 +0000 (12:13 +0000)
diff --git a/dict-gcide-README.DIC b/dict-gcide-README.DIC

new file mode 100644 (file)

index 0000000..780e0bb
--- /dev/null
+++ b/dict-gcide-README.DIC
@@ -0,0 +1,268 @@
+File  README.DIC\r
+  To accompany the GNU version of the set of files (cide.*) containing \r
+                the electronic version of the\r
+       Collaborative International Dictionary of English.\r
+                   (called also GCIDE)\r
+       These files contain Version 0.46 (January 2002)\r
+    * * * * * * * * * * * * * * * * * * * * * * * * * * * *\r
+\r
+The dictionary was derived from the\r
+         Webster's Revised Unabridged Dictionary\r
+                 Version published 1913\r
+               by the  C. & G. Merriam Co.\r
+                   Springfield, Mass.\r
+                 Under the direction of\r
+                Noah Porter, D.D., LL.D.\r
+\r
+and has been supplemented with some of the definitions from\r
+           WordNet, a semantic network created by\r
+              the Cognitive Science Department\r
+                 of Princeton University\r
+                  under the direction of\r
+                   Prof. George Miller\r
+\r
+and is being proof-read and supplemented by volunteers from\r
+around the world.  This is an unfunded project, and future\r
+enhancement of this dictionary will depend on the efforts of\r
+volunteers willing to help build this free resource into a\r
+comprehensive body of general information.  New definitions\r
+for missing words or word senses and longer explanatory notes, \r
+as well as images to accompany the articles, are needed.  More\r
+modern illustrative quotations giving recent examples of\r
+usage of the words in their various senses will be very\r
+helpful, since most quotations in the original 1913 dictionary\r
+are now well over 100 years old.\r
+\r
+   This electronic version is being maintained by World Soul,\r
+a non-profit organization in Plainfield, NJ.  For additional\r
+information or if you are willing to assist in construction of this\r
+data source, contact:\r
+\r
+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\r
+ Patrick J. Cassidy              | TEL:          (908) 561-3416\r
+ World Soul                      | if no answer, (908) 668-5252\r
+ 735 Belvidere Ave.              | FAX:          (908) 668-5904\r
+ Plainfield, NJ  07062-2054\r
+ pc@worldsoul.org   or  cassidy@micra.com\r
+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=\r
+\r
+  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * \r
+\r
+GCIDE is free software; you can redistribute it and/or modify\r
+it under the terms of the GNU General Public License as published by\r
+the Free Software Foundation; either version 2, or (at your option)\r
+any later version.\r
+\r
+GCIDE is distributed in the hope that it will be useful,\r
+but WITHOUT ANY WARRANTY; without even the implied warranty of\r
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\r
+GNU General Public License for more details.\r
+\r
+You should have received a copy of the GNU General Public License\r
+along with this copy of GCIDE; see the file COPYING.  If not, write \r
+to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,\r
+Boston, MA 02111-1307, USA.\r
+           * * * * * * * * * * * * * * * * * * * * * \r
+\r
+STRUCTURE OF THE DICTIONARY\r
+---------------------------\r
+   When the archives are unpacked, the main dictionary text of \r
+the GCIDE will be found in 26 files named "cide.*", where the\r
+asterisk indicates which letter of the alphabet begins the\r
+words in each file.  For example, file "cide.b" contains words \r
+beginning with the letter "B".  Additional information about the \r
+tagging conventions and special character symbols is contained in \r
+ancillary files in this directory (more information below).  The main \r
+body of the 1913 dictionary was essentially identical to the edition\r
+published in 1890, and was republished in 1913 with an appendix \r
+containing "New Words".  The new words of that appendix have been\r
+integrated into the main file in this version.  However, it is important \r
+to keep in mind that the definitions in this dictionary are in most \r
+cases over 100 years old.  Use them with caution!\r
+    At the bottom of each paragraph in this dictionary, there is a\r
+bracketed and tagged "source" indicated.  This tells from where the\r
+definition or other text in that paragraph came, as follows:\r
+\r
+[<source>1913 Webster</source>]\r
+  =  From the original 1890 dictionary.\r
+[<source>Webster 1913 Suppl.</source>]\r
+  =  From the 1913 "New Words" supplement to the Webster.\r
+[<source>WordNet 1.5</source>]\r
+  =  From the WordNet on-line semantic network.\r
+[<source>Century Dict. 1906.</source>]\r
+  =  From the Century Dictionary published in 1906,\r
+          especially from the "Proper Names" supplement\r
+          (volume IX).\r
+[<source>XXX</source>]\r
+   = Added by one of the volunteers.\r
+\r
+    The original definitions have been tagged and in some cases \r
+reformatted or slightly rearranged.  If substantive information\r
+is added from a second source, usually the additional source is\r
+also noted, as in:\r
+[<source>Webster 1913 Suppl.</source> + <source>WordNet 1.5</source>]\r
+\r
+    A list of the ancillary files related to the GCIDE is appended at \r
+the bottom of this "readme.dic" file.\r
+    This version is tagged with SGML-like tags of the form <pos>...</pos>\r
+so that the original typography (italics, bold, block quotes) can be\r
+reproduced.  A list of the most important tags for fields in the \r
+dictionary is given below.  The tags also serve the more important \r
+function of allowing the information content to be conveniently imported\r
+into computer programs or databases.  The set of tags used is described \r
+in the accompanying file "tagset.web".  ***NOTE*** the paragraph tags\r
+<p>...</p> do *not* always nest properly with certain other tags, such \r
+as <note> and <cs> ("collocation section"), which in some cases span\r
+multiple paragraphs.  If you are using a tag parser which detects\r
+improper nesting, you should first either delete the paragraph\r
+tags or convert them to non-tag symbols, or, if possible, set the \r
+parser to ignore the <p>...</p> tags.\r
+    The unusual characters (such as Greek or the European accented\r
+characters, as well as special characters used in the pronunciations)\r
+are described in the accompanying file "webfont.asc".  Some information\r
+on the pronunciation system used may be found by viewing the files\r
+"wxxvii.jpg" and "pronunc.jpg" with an image viewer (or any web browser),\r
+and additional explanations of pronunciation are in the file \r
+"pronunc.web".\r
+     Each paragraph of the original text is enclosed within tags of \r
+the form <p> . . . </p>.  Within these paragraphs are no line\r
+breaks, and some of the paragraphs are over 12,000 characters long.\r
+These lines are too long to be handled by the vi editor, and probably\r
+by some other text editors.  At some points, embedded line breaks within \r
+a "paragraph" are marked by a <br/ "entity".  The file can therefore\r
+be converted, if necessary, to a form with shorter lines, and subsequently\r
+reconverted back to the form having one line per paragraph.\r
+\r
+   If additional line breaks are added, then in order to remove the \r
+line breaks and reconstruct the original paragraphs, so that the \r
+page width can be adjusted, perform the following manipulations:\r
+  (1) convert each line break (cr-lf combination) to a space.\r
+  (2) convert the string "</p>  " (</p> followed by two spaces)\r
+     to </p> followed by two line breaks (cr-lf combinations)\r
+  (3) convert the string "<br/ " (<br/ followed by one space)\r
+     to <br/ followed by one line break (cr-lf).\r
+There will be some "lines" (paragraphs) with over 12,000 characters,\r
+which may give trouble to some simple text editors.\r
+   A more sophisticated formatting of spaces within paragraphs may\r
+require the use of the fully-tagged master files.  If you have\r
+a need for these files, contact Patrick Cassidy: cassidy@micra.com.\r
+   The approximate beginning of each page is marked by an SGML\r
+comment of the form <-- p. 345 -->.  (The exact beginning was in some\r
+cases in the middle of a paragraph, which we decided was not a\r
+good location for these page-number comments, so the page number\r
+was usually moved to the next paragraph break).  Pages which have \r
+been proofread by volunteers (e.g., with initials VOL) will have a \r
+note within that page comment: <-- p. 345 pr=VOL -->.  Pages which have \r
+not been proofread yet (most of them) will have varying numbers of \r
+typographical errors in them.   We still (January 2002) need \r
+proofreaders to get the errors out of these dictionary files.\r
+\r
+***********************************************************************\r
+**                        WARNING!!!                                 **\r
+***********************************************************************\r
+\r
+    This version is only a first typing, and has numerous typographic\r
+errors, including errors in the field-marks.  In addition, the user must\r
+keep in mind that this text is very old and will contain numerous \r
+obsolete, inaccurate, and perhaps offensive statements, which are \r
+included solely because this work is intended to reproduce accurately\r
+this historically interesting classic reference work.  This text should \r
+not be relied upon as an accurate source of information, as in many\r
+cases it represents the state of knowledge around 1890.  The text is\r
+provided "as is", and the user must accept responsibility for all\r
+consequences  of its use. Please refer to the header of each file and\r
+the GNU public license.  If these conditions of use are unacceptable,\r
+please do not use these texts.\r
+************************************************************************\r
+************************************************************************\r
+    This electronic dictionary is also made available as a potential\r
+starting point for development of a modern comprehensive encyclopedic\r
+dictionary, to be accessible freely on the internet, and developed by the\r
+efforts of all individuals willing to help build a large and freely\r
+available knowledge base.  A large number of collaborators are needed to\r
+bring this dictionary to a more accurate, more modern,  and more useful\r
+state. Anyone willing to assist in any way in constructing such a \r
+knowledge base should contact Patrick Cassidy (see above).  All reports \r
+of errors will be gratefully received, and should also be transmitted to \r
+PC at: pc@worldsoul.org.\r
+\r
+In addition to the main text of the dictionary, additional\r
+explanatory material about this version of the dictionary is available\r
+in the ancillary files:\r
+\r
+=====================================================================\r
+COPYING             18,321  11-03-99  1:13a COPYING\r
+README   DIC        13,775  01-17-02 11:48p readme.dic\r
+WEBFONT  ASC        35,234  12-12-01  3:27p WEBFONT.ASC\r
+TAGSET   WEB        55,843  08-16-01  1:16p TAGSET.WEB\r
+PRONUNC  WEB        14,312  06-18-00  3:02p PRONUNC.WEB\r
+PRONUNC  JPG     2,569,796  06-18-00  3:11p PRONUNC.JPG\r
+SYMBOLS  JPG       144,716  06-18-00  3:13p SYMBOLS.JPG\r
+WXXVII   JPG     1,188,380  06-18-00  3:19p WXXVII.JPG\r
+==================================================================\r
+\r
+\r
+Most important tags used in the GCIDE:\r
+<hw> tags the headword\r
+<pr>          pronunciation\r
+<pos>         part of speech\r
+<ety>         etymology\r
+<ets>         "source" word within an <ety> field, usually foreign words\r
+<fld>         field of knowledge (e.g. Med. = medicine)\r
+<def>         definition\r
+<cs>          collocation section  (containing word combinations)\r
+<col>         collocation entry (word combination)\r
+<cd>          collocation definition\r
+<as>          illustrations of usage (within a <def>. . . </def> field)\r
+<au>          authority for a definition, or author of a quotation\r
+<q>           illustrative quotation -- in block quote format\r
+<au>          author of an illustrative <q> quotation\r
+<altname>     alternative name for the headword -- essentially a synonym\r
+<asp>         alternative spelling of the headword\r
+<syn>         list of synonyms for the headword\r
+<p>           paragraph\r
+<b>           bold type\r
+<it>          italic type\r
+\r
+For other tags, see the file "tagset.web"\r
+\r
+\r
+============================================================\r
+            OTHER VERSIONS OF THE DICTIONARY\r
+=============================================================\r
+\r
+   There are several other derivative versions of this dictionary \r
+on the internet, in some cases reformatted or provided with an \r
+interface.  Those that I am aware of are:\r
+\r
+(1) Project Gutenberg\r
+---------------------\r
+   In the etext96 directory of Project Gutenberg (www.prairienet.org)\r
+there is a version of the original 1913 dictionary, which is in\r
+the **public domain**.  The main files are in the directory etext96,\r
+and are labeled pgw050**.***.  The tags for that version are a subset\r
+of those used in this GNU version.\r
+\r
+(2) The DICT development group\r
+------------------------------\r
+This group has created a program to index and search this dictionary.\r
+The program can be downloaded and used locally, but at present\r
+is available only in a Unix-compatible executable version.\r
+See their web site at http://www.dict.org.\r
+\r
+(3)  The University of Chicago ARTFL project\r
+---------------------------------------------\r
+Mark Olsen and Gavin LaRowe at the University of Chicago have \r
+converted the original 1913 dictionary to HTML and have provided an\r
+interface allowing search of the headwords.  When the supplemented\r
+version has developed sufficiently to warrant the effort, a \r
+similar searchable version may be posted there as well.  The\r
+search page is at:\r
+  http://humanities.uchicago.edu/forms_unrest/webster.form.html\r
+\r
+That page will provide links to other ARTFL projects and contact\r
+information for the ARTFL group, who alone can provide information \r
+about the HTML version or interface.\r
+\r
+\r
+ -- PJC\r
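
The three-step procedure that the README gives for removing added line
breaks and restoring one line per paragraph can be sketched in Python as
follows.  This is only an illustration under stated assumptions: the
function name rejoin_paragraphs is hypothetical, the input is assumed to
be already decoded text using cr-lf line breaks, and the added breaks are
assumed to have replaced single spaces.  It is not a tool shipped with
the GCIDE.

```python
def rejoin_paragraphs(text: str) -> str:
    """Undo extra line wrapping, following steps (1)-(3) of the README.

    Hypothetical helper; assumes cr-lf line breaks throughout.
    """
    # Step 1: convert each line break (cr-lf) to a space.
    joined = text.replace("\r\n", " ")
    # Step 2: "</p>" followed by two spaces marks the end of a paragraph;
    # restore the two line breaks that originally followed it.
    joined = joined.replace("</p>  ", "</p>\r\n\r\n")
    # Step 3: "<br/" followed by one space marks an embedded line break
    # (the GCIDE writes this marker without a closing ">").
    joined = joined.replace("<br/ ", "<br/\r\n")
    return joined


# A made-up two-paragraph sample in the one-line-per-paragraph form.
original = "<p>one two<br/\r\nthree four</p>\r\n\r\n<p>five six</p>\r\n\r\n"
# The same sample after a line break was added in place of some spaces.
wrapped = "<p>one\r\ntwo<br/\r\nthree\r\nfour</p>\r\n\r\n<p>five\r\nsix</p>\r\n\r\n"
restored = rejoin_paragraphs(wrapped)
```

The order of the steps matters: step 1 turns the two line breaks after
each </p> into exactly the two spaces that step 2 looks for.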
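
The bracketed "source" note that the README describes at the bottom of
each paragraph can be read without a nesting-aware tag parser, which the
README warns may fail on the <p>...</p> tags.  The sketch below uses a
plain regular expression instead.  The function name paragraph_sources
and the sample entry are made up for illustration and are not taken from
the dictionary files.

```python
import re

# Matches the text between <source> and </source>; non-greedy so that
# two sources joined by "+" are returned separately.
SOURCE_RE = re.compile(r"<source>(.*?)</source>")


def paragraph_sources(paragraph: str) -> list[str]:
    """Return the source names noted in one dictionary paragraph."""
    return SOURCE_RE.findall(paragraph)


# Hypothetical entry, using only tags listed in the README.
sample = (
    "<p><hw>Example</hw> <pos>n.</pos> <def>A made-up definition.</def><br/\r\n"
    "[<source>Webster 1913 Suppl.</source> + <source>WordNet 1.5</source>]</p>"
)
sources = paragraph_sources(sample)
```

For the sample above, sources holds the two names "Webster 1913 Suppl."
and "WordNet 1.5", in that order.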