on the Curl Web Content Markup Language

on the Curl Web Content Markup and Programming Language from www.curl.com and www.curlap.com
Showing posts with label katakana. Show all posts

Friday, June 8, 2012

Kanjidic2 Japanese kanji by School Grade

A new set of 12 Curl applets for studying Japanese kanji went up this morning.

They are linked at http://www.aule-browser.com/kanji/kanjidic2-grades.html

Here is a snapshot of the applet for combined Grade 9 and Grade 10 Japanese kanji from kanjidic2 with meanings and both on and kun readings in katakana and hiragana:


These pages require the MIT Curl Surge RTE browser plug-in from www.curl.com

Sunday, June 3, 2012

serialization for text source integrity


I now have a Curl applet up that uses only serialized data as its source. This will make all of the "Learn Kanji" applets faster from here on and help keep my fingers off my input text source!

The applet loads its data with the {deserialize} macro and never touches the original source.

The approach will make the "kanji of the day" applets that use the Basho haiku as their learning resource much faster: no iteration through the data occurs at load time, and the applets sit one step removed from the vulnerable source.

This will matter even more for the large JMDict Japanese resource, as well as the smaller Kanjidic2 and its Edict2, where I am working with no SQL and no JSON.

An entire array is serialized with a single call that writes one object to a stream.  The object classes are declared as serializable, and every affected field has at least a default value.

The result is that my own annotations are kept independent of the source file.

I may apply this same approach to an applet for viewing the tags in my 9000+ Firefox bookmarks, so as to avoid both HTML and JSON.

Here is a snap:






Monday, May 28, 2012

Curl data versus JSON data (light-weight markup)

Here is the top of a validated JSON katakana data file for Japanese e-learning:

{"katakana": [
{"ucs": "30AB", "utf-8": "E382AB", "kana": "カ", "info": "katakana letter KA"},
{"ucs": "30AC", "utf-8": "E382AC", "kana": "ガ", "info": "katakana letter GA"},
{"ucs": "30AD", "utf-8": "E382AD", "kana": "キ", "info": "katakana letter KI"},
{"ucs": "30AE", "utf-8": "E382AE", "kana": "ギ", "info": "katakana letter GI"},

and here is the Curl:

{let katakana-array:{Array-of Katakana} = {new {Array-of Katakana},
{Katakana "30AB", "E382AB", "カ", "katakana letter KA"},
{Katakana "30AC", "E382AC", "ガ", "katakana letter GA"},
{Katakana "30AD", "E382AD", "キ", "katakana letter KI"},
{Katakana "30AE", "E382AE", "ギ", "katakana letter GI"},

In Curl, both require field definitions for processing. The difference is that the Curl data requires a minimal class definition and a default constructor that assigns every field (a simple value class).

Of course both could have been reduced to mere arrays of strings, but then the iteration over the data would use no tags or keys.  The Curl version is tagged, but internally:

{define-value-class public final Katakana
  field private constant ucs-code:String || = "0000"
  field private constant utf8-code:String || = "000000"
  field private constant kana-char:String || = {String '\u5B57'} || "字"
  field private constant kana-name:String || = "Ji"
  {getter public {ucs}:String
    {return self.ucs-code}
  }
  {getter public {utf-8}:String
    {return self.utf8-code}
  }
  {getter public {katakana}:String
    {return self.kana-char}
  }
  {getter public {character}:String
    {return self.kana-name}
  }
  {constructor {default ucs:String, utf:String, kana:String, info:String}
    set self.ucs-code = ucs
    set self.utf8-code = utf
    set self.kana-char = kana
    set self.kana-name = info
  }
}
{include "./katakana-unicode.scurl"}

The iterator block accesses each instance as, e.g., val.ucs and so forth.
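For comparison, the same keyed access can be sketched against the JSON form of the data. This is Python rather than Curl, and it is only an illustration: the four records are the ones shown above, closed off so the fragment parses, and the helper names utf8_hex and check_records are hypothetical. As a side benefit, the loop checks that the "ucs", "utf-8" and "kana" fields of each record agree with one another.

```python
import json

# The four records from the post, with the closing brackets added
# so that this fragment is a complete JSON document.
KATAKANA_JSON = """
{"katakana": [
{"ucs": "30AB", "utf-8": "E382AB", "kana": "カ", "info": "katakana letter KA"},
{"ucs": "30AC", "utf-8": "E382AC", "kana": "ガ", "info": "katakana letter GA"},
{"ucs": "30AD", "utf-8": "E382AD", "kana": "キ", "info": "katakana letter KI"},
{"ucs": "30AE", "utf-8": "E382AE", "kana": "ギ", "info": "katakana letter GI"}
]}
"""


def utf8_hex(ucs_hex: str) -> str:
    """Return the upper-case UTF-8 hex bytes for a hex code point."""
    return chr(int(ucs_hex, 16)).encode("utf-8").hex().upper()


def check_records(records):
    """Return the records whose fields disagree with each other."""
    bad = []
    for val in records:
        same_char = chr(int(val["ucs"], 16)) == val["kana"]
        same_bytes = utf8_hex(val["ucs"]) == val["utf-8"]
        if not (same_char and same_bytes):
            bad.append(val)
    return bad


records = json.loads(KATAKANA_JSON)["katakana"]
for val in records:
    # Keyed access, the JSON counterpart of val.ucs and friends.
    print(val["ucs"], val["utf-8"], val["kana"], val["info"])
print("inconsistent records:", len(check_records(records)))
```

A check like this is cheap to run whenever the text source is edited by hand.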

In fairness, the JSON could have been

{"katakana-array": [
{"katakana": {"ucs": "30AB", "utf-8": "E382AB", "kana": "カ", "info": "katakana letter KA"}},

but that would further complicate iterating over the data.
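The extra step can be seen in a small sketch. This is Python rather than Curl, the two sample strings are cut down to two fields for brevity, and the function names are hypothetical. It assumes the inner record of the wrapped form is written as an object in braces, since key/value pairs are only valid JSON inside an object.

```python
import json

# Flat form: each array element is the record itself.
FLAT = """
{"katakana": [
{"ucs": "30AB", "kana": "カ"},
{"ucs": "30AC", "kana": "ガ"}
]}
"""

# Wrapped form: each array element is a one-key object around the record.
NESTED = """
{"katakana-array": [
{"katakana": {"ucs": "30AB", "kana": "カ"}},
{"katakana": {"ucs": "30AC", "kana": "ガ"}}
]}
"""


def kana_from_flat(text: str) -> list:
    """One lookup per record."""
    return [rec["kana"] for rec in json.loads(text)["katakana"]]


def kana_from_nested(text: str) -> list:
    """Two lookups per record: unwrap the tag, then read the field."""
    return [item["katakana"]["kana"]
            for item in json.loads(text)["katakana-array"]]


print(kana_from_flat(FLAT))
print(kana_from_nested(NESTED))
```

Both calls yield the same list of kana; the wrapped form only adds a level of indirection on every record.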

In the Curl applet the Curl data is processed dramatically faster, naturally.

The JSON data can be used anywhere, e.g., by Pharo Smalltalk or jQuery in web page widgets.

Note: the UTF-8 can be used to urlencode:  "E382B6" becomes %E3%82%B6 for a URL.
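A minimal sketch of that conversion, in Python rather than Curl: the function name hex_to_percent is hypothetical, and the result is compared with what the standard library's urllib.parse.quote produces for the same character (U+30B6, katakana ZA).

```python
from urllib.parse import quote


def hex_to_percent(hex_utf8: str) -> str:
    """Turn a UTF-8 hex string such as "E382B6" into "%E3%82%B6"."""
    # bytes.fromhex raises ValueError if the input is not valid hex.
    return "".join(f"%{byte:02X}" for byte in bytes.fromhex(hex_utf8))


encoded = hex_to_percent("E382B6")
print(encoded)                      # %E3%82%B6
print(encoded == quote("\u30B6"))   # True
```

Because the data file already stores the UTF-8 bytes as hex, the URL form needs no encoder at all, only a "%" in front of each pair of digits.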

Here is one result using the Curl data:

 

The applet is located at www.aule-browser.com/kanji/kana-charts.html