on the Curl Web Content Markup Language

on the Curl Web Content Markup and Programming Language from www.curl.com and www.curlap.com
Showing posts with label text. Show all posts

Thursday, October 16, 2014

Curl for northerly star charts


I have added chart data files, from hygxyz_northofminus40_ucs.csv through hygxyz_northofminus10_ucs.csv plus a hygxyz_north_only_ucs.csv, at URLs such as

http://www.aule-browser.com/astro/stars/hygxyz_northofminus40_ucs.csv

for which the row counts vary from 93,132 down to 58,767 CSV data rows, with the top 2 rows being the usual info-only lines. The files are comma-separated fields of variable length, encoded as UTF-8.
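For anyone reading these files outside Curl, here is a minimal Python sketch of skipping the two info-only rows and collecting the data rows. The function name `read_star_rows` and the sample columns are my own assumptions for illustration; the post does not show the real column layout.

```python
import csv
import io


def read_star_rows(stream, info_rows=2):
    """Return the data rows of a star-chart CSV stream.

    The first `info_rows` rows are treated as info-only and skipped.
    """
    reader = csv.reader(stream)
    for _ in range(info_rows):
        next(reader, None)  # discard an info-only row, if present
    return [row for row in reader if row]  # ignore blank lines


# Hypothetical sample: two info rows, then data rows with fields
# of varying length. The real column layout may differ.
SAMPLE = (
    "info line one\n"
    "info line two\n"
    "1,Sirius,-1.46\n"
    "2,Vega,0.03\n"
)

rows = read_star_rows(io.StringIO(SAMPLE))
```

For a downloaded file, the same function can be called as `read_star_rows(open(path, encoding="utf-8", newline=""))`, since the files are UTF-8.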




Monday, September 8, 2014

Curl char type and the CharGraphic class


Over at kanji.aule-browser.com I have a Curl applet embedded in HTML as part of my efforts to learn to read and write Japanese kanji.

What is interesting is this: try to select the top 6 or 7 rows in the left-hand widget; now try to paste them into the HTML TextArea below or the Curl RichTextEdit to the right.

See something odd? Two of the kanji are not copied. Try again and observe the selection range. Two kanji do not highlight.

Expression-based Curl is homoiconic, so here is the declarative code/data implicated :

{krow "0014","0000",'〇'}
{krow "0002","2850",'一'}
{krow "0003","1688",'ニ'}
{krow "0437","2851",{on-yomi-only {CharGraphic '丁'}}}
{krow "0009","2854",{CharGraphic '七'}}
{krow "0018","2542",{on-yomi-only 万}}
{krow "0657","2885",'丈'}
{krow "0004","1689",'三'}
{krow "0041","2876",'上'}
{krow "0040","2862",'下'}
{krow "0049","2890",{on-yomi-only 不}}
{krow "0858","2887",'与'}
|| first 12 rows of 'data'

The single-quoted kanji are of type char. I have replaced some of them at random with CharGraphic instances, either as the bare expression or within the {on-yomi-only } macro, which paints some characters red.




Monday, March 4, 2013

formatting French ebooks


You might think that formatting French ebooks in Curl as web markup would be easy, and it is, but there are still challenges.

If you use the TocDocument as the default document-style there are markup hierarchy issues under each {heading level=n, } macro.

I usually start with simply a default vanilla {paragraph } macro and then move to user defined formats and procedures.

The challenge is to make this accessible to the non-programmer working in a non-Western language.

At this time I am evaluating two Rebol language variants (R3 and Red) and one Icon variant (Object Icon) for parsing my Curl markup in accordance not with Curl syntax but with user preferences.

If all else fails, I will fall back to Logtalk with some Prolog dialect.

In the case of my current French ebook, the rules have to cover non-breaking spaces. The rule of thumb in France is to surround "two-part" punctuation (such as ; : ! ?) with spaces. The challenge is to address resizable text frames or panes in which re-flowed text must remain readable.
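As a rough illustration of the non-breaking-space rule, here is a minimal Python sketch. It puts a no-break space (U+00A0) before each "two-part" mark and inside guillemets, so that a re-flowed line never starts with the punctuation. The function name `french_spacing` is hypothetical, and the sketch is deliberately naive: it would also touch colons in URLs or times such as 12:30, which a real rule set would have to exclude.

```python
import re

NBSP = "\u00a0"  # no-break space


def french_spacing(text):
    """Apply a naive version of French punctuation spacing."""
    # Before ; : ! ? replace any run of spaces by one no-break space,
    # inserting one if the source had none.
    text = re.sub(r"(?<=\S)[ \u00a0]*([;:!?])", NBSP + r"\1", text)
    # Inside guillemets: a no-break space after the opening mark
    # and before the closing mark.
    text = re.sub(r"«[ \u00a0]*", "«" + NBSP, text)
    text = re.sub(r"[ \u00a0]*»", NBSP + "»", text)
    return text


spaced = french_spacing("Quoi ? Oui !")
```

The ordinary space after each mark is left alone, so the line may still break there when the frame is resized.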

The minor rules apply to ALL Curl ebooks where the text is in Curl markup (and until we have a webkit wrapper library, that is what I use): converting all ASCII single quotes, double quotes, pseudo-hyphens and brackets, and installing matching left and right quotation devices.
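The quote-matching part of these minor rules can be sketched in Python as below. This is only a sketch under my own assumptions: a quote counts as "opening" when it starts the text or follows whitespace or an opening bracket, and everything else is "closing". The name `typographic_quotes` and its parameters are hypothetical, and pseudo-hyphens and brackets are not handled.

```python
import re

# A quote is treated as opening at the start of the text or right
# after whitespace or an opening bracket (an assumed heuristic).
_OPENING_CONTEXT = r"(?:^|(?<=[\s(\[{]))"


def typographic_quotes(text,
                       double=("\u201c", "\u201d"),
                       single=("\u2018", "\u2019")):
    """Replace ASCII quotes with matching left and right marks."""
    text = re.sub(_OPENING_CONTEXT + '"', double[0], text)
    text = text.replace('"', double[1])  # remaining ones are closing
    text = re.sub(_OPENING_CONTEXT + "'", single[0], text)
    text = text.replace("'", single[1])  # also covers apostrophes
    return text


quoted = typographic_quotes('"Hello," she said.')
```

Passing `double=("«", "»")` gives guillemets for a French text instead of the default curly quotes.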




Wednesday, April 25, 2012

Kanjidic2 as CSV in HTML and text


To aid a neutral party in assessing approaches to digital dictionaries for Japanese, I have posted an HTML file displaying 10,000+ of the first entries in Kanjidic2 at
  http://kanji.aule-browser.com/kanjidic2-m12.html
I have restricted the dump to the Kanji, the UCS code and a max of 12 of a possible 14 meanings.

There are fewer than 10,200 because, in the first 12,155 entries, many had no XML meaning content that lacked a language attribute. Those few thousand may have English translations in markup previously used for foreign languages.
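The filtering described above can be sketched in Python with the standard library. I used the Curl XDM library, so this is an illustration and not my actual code. It assumes the Kanjidic2 element names `character`, `literal`, `cp_value` (with `cp_type="ucs"`) and `meaning` (with an optional `m_lang` attribute); the sample XML is abridged and its second entry is invented for the test.

```python
import xml.etree.ElementTree as ET


def kanji_rows(xml_text, max_meanings=12):
    """Return [kanji, ucs, meaning...] rows for entries that have at
    least one meaning without a language attribute."""
    root = ET.fromstring(xml_text)
    rows = []
    for ch in root.iter("character"):
        literal = ch.findtext("literal")
        ucs = ch.findtext("codepoint/cp_value[@cp_type='ucs']")
        meanings = [m.text for m in ch.iter("meaning")
                    if "m_lang" not in m.attrib and m.text]
        if not meanings:
            continue  # skipped: no meaning lacking a language attribute
        rows.append([literal, ucs] + meanings[:max_meanings])
    return rows


# Abridged, partly invented sample in the assumed Kanjidic2 shape.
SAMPLE_XML = """<kanjidic2>
 <character>
  <literal>\u4e9c</literal>
  <codepoint><cp_value cp_type="ucs">4e9c</cp_value></codepoint>
  <reading_meaning><rmgroup>
   <meaning>Asia</meaning>
   <meaning>rank next</meaning>
   <meaning m_lang="fr">Asie</meaning>
  </rmgroup></reading_meaning>
 </character>
 <character>
  <literal>\u4e02</literal>
  <codepoint><cp_value cp_type="ucs">4e02</cp_value></codepoint>
  <reading_meaning><rmgroup>
   <meaning m_lang="fr">exemple</meaning>
  </rmgroup></reading_meaning>
 </character>
</kanjidic2>"""

rows = kanji_rows(SAMPLE_XML)
```

Each returned row can be written out with `csv.writer` to get a file of the same general shape as the dump: kanji, UCS code, then up to 12 meanings.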

The file can be found as
  http://kanji.aule-browser.com/kanjidic2-m12.csv
with a three-line header which you may have to alter for your purposes.

The Kanjidic2 XML file was parsed using the Curl XDM library from curl.com (in Japanese: http://www.curlap.com).

As it stands, the HTML file should be useful for building custom Anki flashcards (themselves stored as SQLite). I will be using variant CSV output to construct dictionary software with annotations and spaced-repetition options. Curl has both CSV and SQLite libraries in addition to the XML libraries.
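To give an idea of the CSV-to-SQLite step outside Curl, here is a minimal Python sketch. The table layout is hypothetical (it is not Anki's schema), and the column order of kanji, UCS code, then meanings is an assumption based on the description of the dump; the three-line header is skipped.

```python
import csv
import io
import sqlite3


def load_kanji_csv(conn, csv_text, header_lines=3):
    """Load kanji CSV text into a hypothetical `kanji` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kanji "
        "(literal TEXT PRIMARY KEY, ucs TEXT, meanings TEXT)"
    )
    reader = csv.reader(io.StringIO(csv_text))
    for _ in range(header_lines):
        next(reader, None)  # skip the header lines
    for row in reader:
        if len(row) < 2:
            continue  # ignore blank or malformed lines
        conn.execute(
            "INSERT OR REPLACE INTO kanji VALUES (?, ?, ?)",
            (row[0], row[1], "; ".join(row[2:])),
        )
    conn.commit()


# Hypothetical sample with a three-line header.
SAMPLE_CSV = "header 1\nheader 2\nheader 3\n\u4e9c,4e9c,Asia,rank next\n"

conn = sqlite3.connect(":memory:")
load_kanji_csv(conn, SAMPLE_CSV)
stored = conn.execute("SELECT literal, ucs, meanings FROM kanji").fetchall()
```

Annotations or spaced-repetition fields would be extra columns or tables on top of this.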