Subject: as promised/threatened
Tue Apr 14 19:17:45 1998

here we go..

this is a quick script i wrote a couple weeks ago, before having to take a long plane trip. i'm a voracious reader, and wanted to take some e-books with me, but didn't have any on hand, so i popped over to Project Gutenberg and downloaded a few titles. then i was faced with the problem of scrolling through an interminably long file (_A Tramp Abroad_ is *big*), so i decided to split the raw text files into webpages.

the result is the script below, which is written for a Mac but should port to unix with little difficulty.. you have to change the filepath separators from ':' to '/', and i think i've gotten most of them, but, as always, caveat lector.

the basic idea is that the script reads a raw text file paragraph-by-paragraph until it has roughly 6K of data, which is a nice 2-3 screen read, IMHO. it pastes that data into a webpage template and stores it in a designated directory. since the files are numbered sequentially, the script automatically generates 'forward' and 'back' links, allowing the reader to click happily from page to page all the way through the book.

the script will automatically start a new page whenever it sees a paragraph which begins with the word 'Chapter', and since i'm kinda fussy about my file management, it will also try not to put more than about 50 files in a given directory. if there are more than 50 pages in the current directory, it will create a new directory as soon as it starts the next chapter.

you do have to do a bit of tweaking with the output.. the first page has a 'previous' link and the last page has a 'next' link, both pointing at pages that don't exist, which i was too lazy to code out. but it /is/ a quick & easy way of breaking a humongous file into a nicely linked set of webpages.
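to make the paging rule concrete, here's a stripped-down sketch of just that part. it is not the script itself: `paginate` and its 6000-character default are names made up for illustration, and it works on a list of paragraphs already in memory instead of reading a file or writing any pages.

```perl
use strict;
use warnings;

# paginate(\@paragraphs, $limit) -- hypothetical helper, not part of splitter.pl.
# groups paragraphs into pages. a page is closed once it already holds more
# than $limit characters, or when the next paragraph starts with 'Chapter'.
sub paginate {
    my ($paras, $limit) = @_;
    $limit = 6000 unless defined $limit;

    my (@pages, @page);
    my $size = 0;

    for my $para (@$paras) {
        my $full    = $size > $limit;
        my $chapter = $size > 0 && $para =~ /^Chapter/i;
        if ($full || $chapter) {
            push @pages, [@page];    # close the current page
            @page = ();
            $size = 0;
        }
        push @page, $para;
        $size += length $para;
    }
    push @pages, [@page] if @page;   # whatever is left becomes the last page
    return @pages;
}

my @paras = ("Chapter I", "x" x 4000, "y" x 4000, "z" x 10,
             "Chapter II", "the end");
my @pages = paginate(\@paras);
printf "%d pages\n", scalar @pages;   # prints "3 pages"
```

note that the size check happens before a paragraph is added, so a page can run somewhat over the limit. that's why the 6K figure is only approximate.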
-- splitter.pl --

&init();

open (SRC, "$SOURCE_FILE") or die qq(can't read "$SOURCE_FILE": $!);

#### $/ is "\n\n" here, so each read returns one paragraph.
while ($para = <SRC>) {
    $para =~ s/^\s+//;
    next unless ($para =~ /\w/);

    if (&need_new_page ($para, $SIZE)) {
        &mk_page (@data);
        @data = ();
        $SIZE = 0;
    }
    push @data, $para;
    $SIZE += length ($para);
}
&mk_page (@data);
close SRC;
exit (0);

#### SUBROUTINES #############################################

sub init {
    $BASE_DIR    = "Macintosh HD:Desktop Folder:downloads:build";
    $SOURCE_FILE = $BASE_DIR . ":source.txt";
    $tmpl        = "Macintosh HD:Desktop Folder:downloads:tmpl.html";
    $SEPR        = ':';     #### filepath separator.. adjust according to taste.

    @DIRS  = (1,1,1);       #### (previous, current, next) directory numbers
    $DIR   = 1;
    @NAMES = (0,0,1);       #### (previous, current, next) page numbers
    $SIZE  = 0;
    $COUNT = 0;             #### pages written to the current directory
    $PAGES = 1;

    undef $/;               #### slurp the whole template in one read
    open (TMPL, "$tmpl") or die qq(can't read "$tmpl": $!);
    $TMPL = <TMPL>;
    close TMPL;

    $/ = "\n\n";            #### read the source a paragraph at a time
}

sub need_new_page {
    my $line = shift;
    my $size = shift;

    if ($size > 6000) {
        return 1;
    }
    if (($size > 0) && ($line =~ /^Chapter/i)) {
        &ck_new_dir();
        return 1;
    }
    return 0;
}

sub ck_new_dir {
    if ($COUNT > 50) {
        $NEED_DIR = 1;
    }
}

sub mk_page {
    my @data = @_;
    my $tmpl = $TMPL;
    my $text = join ("<p>\n", @data);

    #### current_file() has to run first.. it advances the page and
    #### directory counters that next_url() and prev_url() read.
    my $file = &current_file();
    my $nxt  = &next_url();
    my $pre  = &prev_url();

    $tmpl =~ s/## TEXT ##/$text/;
    $tmpl =~ s/## BACK ##/$pre/;
    $tmpl =~ s/## NEXT ##/$nxt/;

    open (FILE, ">$file") or die qq(can't write "$file": $!);
    print FILE $tmpl;
    close FILE;
}

sub current_file {
    $COUNT++;
    $PAGES++;
    shift @NAMES;
    push @NAMES, $PAGES;

    if ($NEED_DIR) {
        $DIR++;
        $NEED_DIR = 0;
    }
    shift @DIRS;
    push @DIRS, $DIR;

    if ($DIRS[0] != $DIRS[1]) {
        $COUNT = 0;
    }

    $dir = sprintf ("%s%s%03d", $BASE_DIR, $SEPR, $DIRS[1]);
    if ( ! -e $dir) {
        #### 0755, not 0666.. on unix a directory needs the execute bit.
        mkdir ($dir, 0755) or die qq(can't mkdir "$dir": $!);
    }
    return sprintf ("%s%s%03d%s%04d.html",
        $BASE_DIR, $SEPR, $DIRS[1], $SEPR, $NAMES[1]);
}

sub prev_url {
    return sprintf ("../%03d/%04d.html", $DIRS[0], $NAMES[0]);
}

sub next_url {
    return sprintf ("../%03d/%04d.html", $DIRS[2], $NAMES[2]);
}

-- EOF --

and here's a sample of the template page for the previous script:

-- tmpl.html --

<html>
<head><title>E-BOOK</title></head>
<body>

## TEXT ##

<hr>

<a href="## BACK ##">Back</a> | Index | <a href="## NEXT ##">Next Page</a>

</body>
</html>

-- EOF --
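as for the dangling 'previous' link on the first page and 'next' link on the last one, here's one way the navigation bar could be built so they drop out. this is only a sketch: `nav_bar` is a hypothetical helper that isn't in splitter.pl, and for brevity it assumes every page sits in the same directory.

```perl
use strict;
use warnings;

# nav_bar($page, $is_last) -- hypothetical helper, not part of splitter.pl.
# builds the Back | Next Page bar for page number $page (counting from 1),
# leaving out any link that would point before the first page or past the
# last one.
sub nav_bar {
    my ($page, $is_last) = @_;
    my @links;
    push @links, sprintf('<a href="%04d.html">Back</a>', $page - 1)
        if $page > 1;
    push @links, sprintf('<a href="%04d.html">Next Page</a>', $page + 1)
        unless $is_last;
    return join(' | ', @links);
}

print nav_bar(1, 0), "\n";   # first page: Next Page only
print nav_bar(2, 0), "\n";   # middle page: Back | Next Page
print nav_bar(3, 1), "\n";   # last page: Back only
```

to wire something like this into the script, the one `&mk_page (@data)` call that comes after the read loop is the only place the last page gets written, so that call could pass along an 'is last' flag. the template would then need a single placeholder for the whole bar in place of separate ## BACK ## and ## NEXT ## ones.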