Tuesday, August 9, 2011

Work in Enchant

1 Add a hyphenation function to Enchant
What I have done:
First, I added a hyphenation module to Enchant.
================about the code===========
I think we can combine hyphenation with spell-checking,
which makes the code more flexible.
In my opinion, the API could look like this:
EnchantDict *enchant_broker_request_dict (EnchantBroker *broker,
                                          const char *const lang); /* same as spell-checking */
char *enchant_dict_hyphenate (EnchantDict *dict,
                              const char *const word, size_t len);

===================== 1
To achieve this, we need to add a hyphenation function pointer to
EnchantDict, something like:
char **(*hyphenate) (struct str_enchant_dict * me,
                          const char *const word, size_t len,
                          size_t * out_n_suggs);

======================== 2
The function is implemented by each backend:
static char **
ispell_dict_hyphenate (EnchantDict * me, const char *const word,
                    size_t len, size_t * out_n_suggs)
{
       ISpellChecker * checker;

       checker = (ISpellChecker *) me->user_data;
       return checker->hyphenate (word, len, out_n_suggs);
}
 ====================3
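 Finally, we set up the connection. Each backend assigns its own
 implementation to the hyphenate slot (one assignment per backend):

 dict->hyphenate = ispell_dict_hyphenate;
 dict->hyphenate = hspell_dict_hyphenate;
 dict->hyphenate = zemberek_dict_hyphenate;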
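
To make the three steps above concrete, here is a minimal, self-contained sketch of the same dispatch pattern. It does not use Enchant itself: SketchDict, toy_dict_hyphenate and sketch_dict_hyphenate are hypothetical stand-ins for EnchantDict, a backend function and the public entry point, and the toy backend simply splits the word in the middle.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for EnchantDict: only the fields needed here. */
typedef struct SketchDict SketchDict;
struct SketchDict {
    void *user_data; /* backend-private state */
    /* Slot filled in by the backend; NULL if hyphenation is unsupported. */
    char *(*hyphenate)(SketchDict *me, const char *word, size_t len);
};

/* Toy backend: inserts one '-' in the middle of the word. A real backend
 * would consult its hyphenation dictionary instead. Caller frees. */
static char *toy_dict_hyphenate(SketchDict *me, const char *word, size_t len)
{
    (void)me; /* unused in this toy backend */
    size_t split = len / 2;
    char *out = malloc(len + 2); /* word + '-' + '\0' */
    if (out == NULL)
        return NULL;
    memcpy(out, word, split);
    out[split] = '-';
    memcpy(out + split + 1, word + split, len - split);
    out[len + 1] = '\0';
    return out;
}

/* Public entry point: checks its arguments, then dispatches to whatever
 * function the backend stored in the slot. */
char *sketch_dict_hyphenate(SketchDict *dict, const char *word, size_t len)
{
    if (dict == NULL || word == NULL || dict->hyphenate == NULL)
        return NULL;
    return dict->hyphenate(dict, word, len);
}
```

A backend that cannot hyphenate leaves the slot NULL, and the entry point then returns NULL instead of crashing.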

2 Add five backends to support hyphenation,
  including ispell, myspell, zemberek, voikko, uspell

Enchant has eight kinds of backend to support multiple languages, for example:
USpell
HSpell
ASpell
MySpell....

However, I found that AbiWord uses only two backends:
libenchant_ispell.dll  libenchant_myspell.dll

Another surprise is that in dictionary\ispell I can find only
one dictionary, american.hash (en_US),
and the hunspell dictionary directory is empty.


The changes:
1 deleted the unneeded connections, such as HSpell
2 added the hunspell (myspell) hyphenation code
3 implemented hyphenation using hunspell
4 implemented hyphenation using Zemberek

================1 deleted the unneeded connections, such as HSpell===========
Hebrew does not need any hyphenation
Yiddish does not need any hyphenation

====================2  implement hyphenation using hunspell
added files:
hyphen/hnjalloc.h
hyphen/hnjalloc.c
hyphen/hyph_en_US.dic
hyphen/hyphen.c
hyphen/hyphen.gyp
hyphen/hyphen.h
hyphen/hyphen.patch
hyphen/hyphen.tex

This part still needs more testing.
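
As a rough illustration of what the backend does with the hyphen library's answer, here is a small self-contained sketch. It assumes the convention, as I understand it from hyphen.h, that the library fills a per-character string of digits in which an odd digit at position i means a break is allowed after character i. The function name apply_break_markers is hypothetical, and the sketch treats the word as single-byte characters.

```c
#include <stdlib.h>
#include <string.h>

/* Builds a hyphenated copy of `word` from a per-character marker string.
 * Assumption: markers[i] is a digit character, and an odd digit means a
 * break is allowed after word[i]. Returns NULL on error. Caller frees. */
char *apply_break_markers(const char *word, const char *markers)
{
    size_t len = strlen(word);
    if (strlen(markers) < len)
        return NULL; /* marker string too short for this word */

    char *out = malloc(2 * len + 1); /* worst case: '-' after every char */
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < len; i++) {
        out[j++] = word[i];
        /* Never emit a trailing hyphen after the last character. */
        if (i + 1 < len && ((markers[i] - '0') & 1))
            out[j++] = '-';
    }
    out[j] = '\0';
    return out;
}
```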

==================== 3 implement hyphenation using Zemberek
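 This just uses dbus_g_proxy_call:

char* Zemberek::hyphenate(const char* word)
{
       // The reply is requested as G_TYPE_STRV, so it must be a char**.
       char **syllables = NULL;
       GError *Error = NULL;
       if (!dbus_g_proxy_call (proxy, "hecele", &Error,
               G_TYPE_STRING, word, G_TYPE_INVALID,
               G_TYPE_STRV, &syllables, G_TYPE_INVALID)) {
                       g_error_free (Error);
                       return NULL;
       }

       // Assumption: join the returned pieces with '-' into one string.
       char *result = g_strjoinv ("-", syllables);
       g_strfreev (syllables);
       return result;   // caller frees with g_free
}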


The attachment is today's updated version. We now have four
hyphenation backends:
1 hunspell: uses a separate dictionary, such as hyph_en_US.dic;
  dictionaries can be downloaded from the internet
2 libhyphenation: the dictionary is provided by the author, so it is sometimes limited
3 zemberek: for Turkish
4 Voikko: for Finnish
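
One way to picture how a language tag maps onto these backends is the small sketch below. It is only an illustration of the list above, not how Enchant actually selects providers: the function names are hypothetical, and falling back to hunspell for every other language is my assumption.

```c
#include <string.h>

/* Returns 1 if `lang` starts with the two-letter subtag `code`, followed
 * by the end of the string or a separator ("tr", "tr_TR", "tr-TR"). */
static int has_primary_subtag(const char *lang, const char *code)
{
    if (strncmp(lang, code, 2) != 0)
        return 0;
    return lang[2] == '\0' || lang[2] == '_' || lang[2] == '-';
}

/* Hypothetical helper: names the backend this post associates with a
 * language tag. Assumption: everything else falls back to hunspell. */
const char *pick_hyphenation_backend(const char *lang)
{
    if (lang == NULL)
        return NULL;
    if (has_primary_subtag(lang, "tr"))
        return "zemberek"; /* Turkish */
    if (has_primary_subtag(lang, "fi"))
        return "voikko";   /* Finnish */
    return "hunspell";     /* needs a matching hyph_*.dic file */
}
```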


3 Deploying Enchant
I just copy the Enchant build results into the AbiWord tree:
enchant\bin\Debug\libenchant_myspell.dll ---->
abiword\msvc2008\Debug\lib\enchant\libenchant_myspell.dll
enchant\bin\Debug\libenchant_ispell.dll ---->
abiword\msvc2008\Debug\lib\enchant\libenchant_ispell.dll
enchant\bin\Debug\libenchant.dll ---->abiword\msvc2008\Debug\bin\libenchant.dll

3 Implement the five backends
4 Call Enchant's hyphenation module to display the hyphenation result
in AbiWord
5 After a user operation, refresh the hyphenation result accordingly,
  including when the user adds, deletes, copies, or cuts a word


6 Test on Linux
Todo:
1 Fully support hyphenation in AbiWord
2 Support more languages besides en_US
3 Run some tests on Linux (Unix)


Still to improve:
1 refactor the code
2 handle more languages
3 cover more user operations (for example, operations on pictures may
influence the hyphenation result)
