Rough wind, that moanest loud
Grief too sad for song;
Wild wind, when sullen cloud
Knells all the night long;
Sad storm, whose tears are vain,
Bare woods, whose branches strain,
Deep caves and dreary main, -
Wail, for the world's wrong!
Accordingly, a posthumous work in Spanish might be described as:
scheme = "rfc1766" content = "es">
lang = "es" content = "La Mesa Verde y la Silla Roja">
lang = "en" content = "The Green Table and the Red Chair">
content = "1935">
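content = "1939">

Note that the qualifier syntax and label suffixes (which follow an element name and a period) used in the examples in this document merely reflect current trends in the HTML encoding of qualifiers. This syntax and these suffixes are neither a standard nor a recommendation.

7. Encoding Dublin Core Elements

This section gives usage examples for each of the Dublin Core elements.

Title (name given to the resource)
-----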
lang = "en" content = "The Author gives some Account of Himself and Family -- His First Inducements to Travel -- He is Shipwrecked, and Swims for his Life -- Gets safe on Shore in the Country of Lilliput -- Is made a Prisoner, and carried up the Country">
content = "A tutorial and reference manual for Java.">
content = "Seated family of five, coconut trees to the left, sailboats moored off sandy beach to the right, with volcano in the background.">
#!/depot/bin/perl
#
# This simple perl script extracts metadata embedded in an HTML file
# and outputs it in an alternate format.  Issues warning about missing
# element name or value.
#
# Handles mixed case tags and attribute values, one per line or spanning
# several lines.  Also handles a quoted string spanning multiple lines.
# No error checking.  Does not tolerate more than one "<meta" per line.

print "@(urc;\n";
while (<>) {
        next if (! /<meta/i);           # skip lines without a meta tag
        ($meta) = /(<meta.*$)/i;
        if (! /<meta.*>/i) {            # tag continues on following lines
                while (<>) {
                        $meta .= $_;
                        last if (/>/);
                }
        }
        $name = $meta =~ /name\s*=\s*"([^"]*)"/i
                ? $1 : "MISSING ELEMENT NAME";
        $content = $meta =~ /content\s*=\s*"([^"]*)"/i
                ? $1 : "MISSING ELEMENT VALUE";
        ($scheme) = $meta =~ /scheme\s*=\s*"([^"]*)"/i;
        ($lang) = $meta =~ /lang\s*=\s*"([^"]*)"/i;

        if ($lang || $scheme) {
                $mod = " ($lang";
                if (! $scheme)
content = "Simpson, Homer">
content = "(--mbtitle)">
content = "(--mbfilemodtime)">
content = "(--mbbaseURL)/(--mbfilename)">
content = "text/html; (--mbfilesize)">
content = "(--mblanguage)-BUREAUCRATESE">
content = "Springfield Nuclear">
href = "http://purl.org/DC/elements/1.0/">
href = "http://nukes.org/ReactorCore/rc">

Once its variable references are replaced with actual values, the template above serves as the metadata block describing a document. With our script, the following variables are replaced in both the template and the document:

(--mbfilesize)      size of the final output file
(--mbtitle)         title of the document
(--mblanguage)      language of the document
(--mbbaseURL)       beginning part of document identifier
(--mbfilename)      last part (minus .html) of identifier
(--mbfilemodtime)   last modification date of the document

Here is an HTML document that uses the script:
content = "Memorandum">
From: Acting Shift Supervisor To: Plant Control Personnel RE: (--mbtitle) Date: (--mbfilemodtime)
Pursuant to directive DOH:10.2001/405aec of article B-2022, subsection 48.2.4.4.1c regarding staff morale and employee productivity standards, the current allocation of doughnut acquisition funds shall be increased effective immediately.
From: Acting Shift Supervisor To: Plant Control Personnel RE: Nutritional Allocation Increase Date: 1999-03-08
Pursuant to directive DOH:10.2001/405aec of article B-2022, subsection 48.2.4.4.1c regarding staff morale and employee productivity standards, the current allocation of doughnut acquisition funds shall be increased effective immediately.
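The variable replacement described above can be sketched in a few lines of Perl. This is only an illustration under stated assumptions, not the conversion script itself: the helper names `expand_vars` and `size_field` and the hash of sample values are hypothetical. The first helper replaces each `(--mbNAME)` reference with a value. The second formats a byte count into a field exactly as wide as the 14-character string `(--mbfilesize)`, which is what allows a size to be written in place later without changing the length of the file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the document's script): replace each
# (--mbNAME) reference in $text with the value stored under NAME in
# %$vars.  References to unknown names are left untouched.
sub expand_vars {
    my ($text, $vars) = @_;
    $text =~ s/\(--mb(\w+)\)/exists $vars->{$1} ? $vars->{$1} : "(--mb$1)"/ge;
    return $text;
}

# Hypothetical helper: format a byte count into a 14-character field,
# the same width as the string "(--mbfilesize)".  Small sizes are shown
# in bytes; larger ones are scaled down by 1024 until they fit.
sub size_field {
    my ($size) = @_;
    my $scale = 0;
    if ($size >= 100000) {
        for (; $size >= 1000; $scale++) { $size /= 1024; }
    }
    return sprintf "%7.7s %sbytes", $size,
        (" ", "K", "M", "G", "T", "P")[$scale];
}

# Sample values, assumed for illustration only.
my %vars = (
    title       => "Nutritional Allocation Increase",
    filemodtime => "1999-03-08",
);

print expand_vars("RE: (--mbtitle) Date: (--mbfilemodtime)\n", \%vars);
print "[", size_field(250000), "]\n";
```

Leaving unknown references untouched is a design choice of this sketch: it lets a placeholder such as `(--mbfilesize)` survive the first pass so that it can be filled in once the final size is known.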
Below is the script that performs this conversion:

#!/depot/bin/perl
#
# This Perl script processes metadata block declarations of the form
# <!--metablock Title of Document --> and variable references of the
# form (--mbVARNAME), replacing them with full metadata blocks and
# variable values, respectively.  Requires a "template" file.
# Outputs an HTML file.
#
# Invoke this script with a single filename argument, "foo".  It creates
# an output file "foo.html" using a temporary working file "foo.work".
# The size of foo.work is measured after variable replacement, and is
# later inserted into the file in such a way that the file's size does
# not change in the process.  Has little or no error checking.

$infile = shift;
open(IN, "< $infile")
        or die("Could not open input file \"$infile\"");
$workfile = "$infile.work";
unlink($workfile);
open(WORK, "+> $workfile")
        or die("Could not open work file \"$workfile\"");

@offsets = ();          # records locations for late size replacement
$title = "";            # gets the title during metablock processing
$language = "en";       # pre-set language here (not in the template)
$baseURL = "http://moes.bar.com/doh";   # pre-set base URL here also
$filename = "$infile.html";             # final output filename
$filesize = "(--mbfilesize)";           # replaced late (separate pass)

# Last modification date of the input file as YYYY-MM-DD.  Assumption:
# this assignment does not appear in this copy of the script, yet
# putout below uses $filemodtime.
($mday, $mon, $year) = (localtime((stat IN)[9]))[3, 4, 5];
$filemodtime = sprintf "%04d-%02d-%02d", 1900 + $year, 1 + $mon, $mday;

sub putout {            # outputs current line with variable replacement
        if (! /\(--mb/) {
                print WORK;
                return;
        }
        if (/\(--mbfilesize\)/)                 # remember where it was
                { push @offsets, tell WORK; }   # but don't replace yet
        s/\(--mbtitle\)/$title/g;
        s/\(--mblanguage\)/$language/g;
        s/\(--mbbaseURL\)/$baseURL/g;
        s/\(--mbfilename\)/$filename/g;
        s/\(--mbfilemodtime\)/$filemodtime/g;
        print WORK;
}
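while (<IN>) {                          # main loop for input file
        if (! /(.*)<!--metablock\s*(.*)/) {
                &putout;                # no metablock declaration here
                next;
        }
        ($_, $title) = ($1, $2);        # text before it, start of title
        &putout;
        if ($title =~ s/\s*-->(.*)//) { # declaration ends on this line
                $remainder = $1;
        }
        else {                          # declaration spans several lines
                while (<IN>) {
                        last if (/(.*)\s*-->(.*)/);
                        $title .= $_;
                }
                $title .= $1;
                $remainder = $2;
        }
        open(TPLATE, "< template")
                or die("Could not open template file");
        while (<TPLATE>)                # subloop for template file
                { &putout; }
        close(TPLATE);
        $_ = $remainder;
        &putout;
}
close(IN);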
# Now replace filesize variables without altering total byte count.

select( (select(WORK), $| = 1) [0] );   # first flush output so we
if (($size = -s WORK) < 100000)         # can get final file size
        { $scale = 0; }                 # and set scale factor or
else {          # else compute it, keeping width of size field low
        for ($scale = 0; $size >= 1000; $scale++)
                { $size /= 1024; }
}
$filesize = sprintf "%7.7s %sbytes", $size,
        (" ", "K", "M", "G", "T", "P") [$scale];

foreach $pos (@offsets) {       # loop through saved size locations
        seek WORK, $pos, 0;     # read the line found there
        $_ = <WORK>;
        # $filesize must be exactly as wide as "(--mbfilesize)"
        s/\(--mbfilesize\)/$filesize/g;
        seek WORK, $pos, 0;     # rewrite it with replacement
        print WORK;
}
close(WORK);
rename($workfile, "$filename")
        or die("Could not rename \"$workfile\" to \"$filename\"");
# ---- end of Perl script ----

10. Author's Address

John A. Kunze
Center for Knowledge Management
University of California, San Francisco
530 Parnassus Ave, Box 0840
San Francisco, CA 94143-0840, USA
Fax: +1 415-476-4653
EMail: jak@ckm.ucsf.edu

11. References

[AAT] Art and Architecture Thesaurus, Getty Information Institute.
    http://shiva.pub.getty.edu/aat_browser/

[AC] The A-Core: Metadata about Content Metadata, (in progress).
    http://metadata.net/ac/draft-iannella-admin-01.txt

[DC1] Weibel, S., Kunze, J., Lagoze, C. and M. Wolf, "Dublin Core Metadata for Resource Discovery", RFC 2413, September 1998.
    ftp://ftp.isi.edu/in-notes/rfc2413.txt

[DCHOME] Dublin Core Initiative Home Page.
    http://purl.org/DC/

[DCPROJECTS] Projects Using Dublin Core Metadata.
    http://purl.org/DC/projects/index.htm

[DCT1] Dublin Core Type List 1, DC Type Working Group, March 1999.
    http://www.loc.gov/marc/typelist.html

[freeWAIS-sf2.0] The enhanced freeWAIS distribution, February 1999.
    http://ls6-www.cs.uni-dortmund.de/ir/projects/freeWAIS-sf/

[GLIMPSE] Glimpse Home Page.
    http://glimpse.cs.arizona.edu/

[HARVEST] Harvest Web Indexing.
    http://www.tardis.ed.ac.uk/harvest/

[HTML4.0] Hypertext Markup Language 4.0 Specification, April 1998.
    http://www.w3.org/TR/REC-html40/

[ISEARCH] Isearch Resources Page.
    http://www.etymon.com/Isearch/

[ISO639-2] Code for the representation of names of languages, 1996.
    http://www.indigo.ie/egt/standards/iso639/iso639-2-en.html

[ISO8601] ISO 8601:1988(E), Data elements and interchange formats -- Information interchange - Representation of dates and times, International Organization for Standardization, June 1988.
    http://www.iso.ch/markete/8601.pdf

[MARC] USMARC Format for Bibliographic Data, US Library of Congress.
    http://lcweb.loc.gov/marc/marc.html

[PERL] L. Wall, T. Christiansen, R. Schwartz, Programming Perl, Second Edition, O'Reilly, 1996.

[RDF] Resource Description Framework Model and Syntax Specification, February 1999.
    http://www.w3.org/TR/REC-rdf-syntax/

[RFC1766] Alvestrand, H., "Tags for the Identification of Languages", RFC 1766, March 1996.
    ftp://ftp.isi.edu/in-notes/rfc1766.txt

[SWISH-E] Simple Web Indexing System for Humans - Enhanced.
    http://sunsite.Berkeley.EDU/SWISH-E/

[TGN] Thesaurus of Geographic Names, Getty Information Institute.
    http://shiva.pub.getty.edu/tgn_browser/

[WTN8601] W3C Technical Note - Profile of ISO 8601 Date and Time Formats.
    http://www.w3.org/TR/NOTE-datetime

[XML] Extensible Markup Language (XML).
    http://www.w3.org/TR/REC-xml

12. Full Copyright Statement

Copyright (C) The Internet Society (1999). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

Funding for the RFC Editor function is currently provided by the Internet Society.