# TODO:
# - warnings at compile stage about pointer size on amd64 - needs check
-# - build dynamic library, not the static one
Summary: Tesseract Open Source OCR Engine
Summary(pl.UTF-8): Tesseract - silnik OCR o otwartych źródłach
Name: tesseract
-Version: 2.00
-Release: 0.9
-License: Apache Software License v2
+Version: 3.04.00
+Release: 0.1
+License: Apache v2.0
Group: Applications/Graphics
-Source0: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.tar.gz
-# Source0-md5: 6d68d940ed15c61300cb04019c30f46c
-Source1: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.eng.tar.gz
-# Source1-md5: b8291d6b3a63ce7879d688e845e341a9
-Source2: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.fra.tar.gz
-# Source2-md5: 64896b462e62572a3708bb461820126c
-Source3: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.ita.tar.gz
-# Source3-md5: 2759e1dae91a989a43490ff4c2253a4b
-Source4: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.deu.tar.gz
-# Source4-md5: 609d91b1ae3759a756b819b5d8403653
-Source5: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.spa.tar.gz
-# Source5-md5: bc26a777b2384613895677cb8e61ca75
-Source6: http://tesseract-ocr.googlecode.com/files/%{name}-%{version}.nld.tar.gz
-# Source6-md5: b2f6ede182cea4bbfffd3b040533ce58
-Patch0: %{name}-globals.patch
+Source0: https://github.com/tesseract-ocr/tesseract/archive/%{version}.tar.gz
+# Source0-md5: 078130b9c7d28c558a0e49d432505864
URL: http://code.google.com/p/tesseract-ocr/
+BuildRequires: autoconf >= 2.50
BuildRequires: automake
-BuildRequires: libtiff-devel
+BuildRequires: leptonlib-devel >= 1.71
+BuildRequires: libstdc++-devel
+BuildRequires: libtool
+Suggests: tesseract-data >= 3
BuildRoot: %{tmpdir}/%{name}-%{version}-root-%(id -u -n)
%description
latach 1985-1995. W 1995 roku był jednym z 3 najlepszych wg UNLV.
Źródła zostały uwolnione przez HP i UNLV w 2005 roku.
-%package lang-de
-Summary: German language data for Tesseract
-Summary(pl.UTF-8): Dane języka niemieckiego dla Tesseracta
-Group: Applications/Graphics
-Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-deu
-
-%description lang-de
-This package contains the data files required to recognize German
-language.
-
-%description lang-de -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-niemieckiego.
-
-%package lang-en
-Summary: English language data for Tesseract
-Summary(pl.UTF-8): Dane języka angielskiego dla Tesseracta
-Group: Applications/Graphics
-Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-eng
-
-%description lang-en
-This package contains the data files required to recognize English
-language.
-
-%description lang-en -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-angielskiego.
-
-%package lang-es
-Summary: Spanish language data for Tesseract
-Summary(pl.UTF-8): Dane języka hiszpańskiego dla Tesseracta
-Group: Applications/Graphics
-Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-spa
-
-%description lang-es
-This package contains the data files required to recognize Spanish
-language.
-
-%description lang-es -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-hiszpańskiego.
-
-%package lang-fr
-Summary: French language data for Tesseract
-Summary(pl.UTF-8): Dane języka francuskiego dla Tesseracta
-Group: Applications/Graphics
-Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-fra
-
-%description lang-fr
-This package contains the data files required to recognize French
-language.
-
-%description lang-fr -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-francuskiego.
-
-%package lang-it
-Summary: Italian language data for Tesseract
-Summary(pl.UTF-8): Dane języka włoskiego dla Tesseracta
-Group: Applications/Graphics
-Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-ita
-
-%description lang-it
-This package contains the data files required to recognize Italian
-language.
-
-%description lang-it -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-włoskiego.
-
-%package lang-nl
-Summary: Dutch language data for Tesseract
-Summary(pl.UTF-8): Dane języka holenderskiego dla Tesseracta
-Group: Applications/Graphics
+%package devel
+Summary: Header files for Tesseract libraries
+Summary(pl.UTF-8): Pliki nagłówkowe bibliotek Tesseracta
+Group: Development/Libraries
Requires: %{name} = %{version}-%{release}
-Obsoletes: tesseract-nl
+Requires: leptonlib-devel
+Requires: libstdc++-devel
-%description lang-nl
-This package contains the data files required to recognize Dutch
-language.
+%description devel
+This package contains the development header files necessary to
+develop applications using Tesseract API.
-%description lang-nl -l pl.UTF-8
-Ten pakiet zawiera pliki danych potrzebne do rozpoznawania języka
-holenderskiego.
+%description devel -l pl.UTF-8
+Ten pakiet zawiera pliki nagłówkowe potrzebne do tworzenia programów
+wykorzystujących API Tesseracta.
-%package devel
-Summary: Tesseract - Development header files and libraries
-Summary(pl.UTF-8): Tesseract - Pliki nagłówkowe i biblioteki dla programistów
+%package static
+Summary: Static Tesseract libraries
+Summary(pl.UTF-8): Statyczne biblioteki Tesseracta
Group: Development/Libraries
+Requires: %{name}-devel = %{version}-%{release}
-%description devel
-This package contains the development header files and libraries
-necessary to develop applications using Tesseract.
+%description static
+Static Tesseract libraries.
+
+%description static -l pl.UTF-8
+Statyczne biblioteki Tesseracta.
%prep
%setup -q
-#%patch0 -p1
-tar xzf %{SOURCE1}
-tar xzf %{SOURCE2}
-tar xzf %{SOURCE3}
-tar xzf %{SOURCE4}
-tar xzf %{SOURCE5}
-tar xzf %{SOURCE6}
%build
-cp -f /usr/share/automake/config.sub config
+%{__libtoolize}
+%{__aclocal}
+%{__autoconf}
+%{__autoheader}
+%{__automake}
%configure
%{__make}
%{__make} install \
	DESTDIR=$RPM_BUILD_ROOT
+# classifier_tester appears to be a test program, so it is not packaged
+%{__rm} $RPM_BUILD_ROOT%{_bindir}/classifier_tester
+%{__rm} $RPM_BUILD_ROOT%{_libdir}/libtesseract.la
+
%clean
rm -rf $RPM_BUILD_ROOT
+%post -p /sbin/ldconfig
+%postun -p /sbin/ldconfig
+
%files
%defattr(644,root,root,755)
-%doc AUTHORS COPYING ChangeLog README
+%doc AUTHORS COPYING ChangeLog README ReleaseNotes
+%attr(755,root,root) %{_bindir}/ambiguous_words
%attr(755,root,root) %{_bindir}/cntraining
+%attr(755,root,root) %{_bindir}/combine_tessdata
+%attr(755,root,root) %{_bindir}/dawg2wordlist
%attr(755,root,root) %{_bindir}/mftraining
+%attr(755,root,root) %{_bindir}/shapeclustering
%attr(755,root,root) %{_bindir}/tesseract
%attr(755,root,root) %{_bindir}/unicharset_extractor
%attr(755,root,root) %{_bindir}/wordlist2dawg
+%attr(755,root,root) %{_libdir}/libtesseract.so.*.*.*
+%attr(755,root,root) %ghost %{_libdir}/libtesseract.so.3
%dir %{_datadir}/tessdata
-%{_datadir}/tessdata/confsets
%dir %{_datadir}/tessdata/configs
%{_datadir}/tessdata/configs/*
%dir %{_datadir}/tessdata/tessconfigs
%{_datadir}/tessdata/tessconfigs/*
+%{_mandir}/man1/ambiguous_words.1*
+%{_mandir}/man1/cntraining.1*
+%{_mandir}/man1/combine_tessdata.1*
+%{_mandir}/man1/dawg2wordlist.1*
+%{_mandir}/man1/mftraining.1*
+%{_mandir}/man1/shapeclustering.1*
+%{_mandir}/man1/tesseract.1*
+%{_mandir}/man1/unicharset_extractor.1*
+%{_mandir}/man1/wordlist2dawg.1*
-%files lang-de
-%defattr(644,root,root,755)
-%{_datadir}/tessdata/deu.*
-
-%files lang-en
-%defattr(644,root,root,755)
-%{_datadir}/tessdata/eng.*
-
-%files lang-es
-%defattr(644,root,root,755)
-%{_datadir}/tessdata/spa.*
-
-%files lang-fr
-%defattr(644,root,root,755)
-%{_datadir}/tessdata/fra.*
-
-%files lang-it
-%defattr(644,root,root,755)
-%{_datadir}/tessdata/ita.*
-
-%files lang-nl
+%files devel
%defattr(644,root,root,755)
-%{_datadir}/tessdata/nld.*
+%attr(755,root,root) %{_libdir}/libtesseract.so
+%{_includedir}/%{name}
+%{_pkgconfigdir}/tesseract.pc
+%{_mandir}/man5/unicharambigs.5*
+%{_mandir}/man5/unicharset.5*
-%files devel
+%files static
%defattr(644,root,root,755)
-%dir %{_includedir}/%{name}
-%{_includedir}/%{name}/*.h
-%{_libdir}/*.a
+%{_libdir}/libtesseract.a