Seeking more scannable content, Distributed Proofreaders volunteer Marilynda Fraser-Cunliffe approached the Northern Nut Growers Association. Would the growers like DP to digitize the archive of the association’s annual reports? They contain a wealth of information on nut-growing, not just dry statistics.

Marilynda writes at the DP forums: “The commitment from the organization is very firm. The Board of Directors is behind me 100% and has given me access to their library copies for digitization purposes. The members are very excited and will be assisting in the proofreading.”

Are any other small organizations interested in DP/Gutenberg people pitching in with digitization? Reach DP at this address. Gutenberg’s normal text and HTML formats might not suffice for your group’s purposes, but even then the “waste product”–the scans, OCRed texts, indexes and the rest–can help your organization grow a digital library.

Meanwhile here is an excerpt from Marilynda’s proposal to the NNGA:

One of the primary functions, if not the primary function, of the NNGA is to share information concerning growing nut trees. That is the whole purpose of publishing an Annual Report.

As time passes, it will be more difficult to obtain print copies of the older reports without going to the expense of reprinting earlier editions. If the reports are archived on a site such as PG, the texts will be available for many generations to come. The .html versions will supply the reader with copies of all the accompanying exhibits. The hard work of the NNGA members will be preserved in perpetuity.

NNGA has already taken steps toward digitizing the distribution of their information. On your website, the idea of delivering The Nutshell in a .pdf format is currently being tested.

In addition to managing the production of plain text and html versions of the annual reports, I can also provide the NNGA with digital copies for their own use. These can be .pdf versions of each book, as well as .png (portable network graphics) of each page within a book. These will be provided on a CD-Rom for use as NNGA sees fit. Some examples of what you might do with these files are:

  • Provide old, out-of-print copies to your membership via e-mail.
  • Duplicate the CD-Rom(s) and sell them to your membership. Copies of the Annual Reports can be grouped together on CD-Roms in any combination you wish. Years 1-25, 26-50, etc.
  • Host the .pdf, .png or .tiff versions on your website, although I would suggest that you simply provide a page on the website that links directly to the PG listings for the books. A plain text and html version would be produced for each volume. This will allow you to provide the content without expending the resources to host the e-books on your website.
  • Each e-text will have a notification containing the NNGA URL.

    The exact wording can be discussed at a later date, but could read something like this, “This material was provided by the Northern Nut Growers Association, Inc. For additional information on nut-growing, or on the association, please visit http://www.nutgrowing.org.”

16 COMMENTS

  1. The linked document has that information: “The cost to archive your Annual Reports is minimal. As mentioned earlier, there is no cost to produce the books through DP, nor is there a cost associated with hosting the finished projects on PG.”

    The true compensation for DP is that we get more books to process. 🙂

  2. The only cost to the NNGA is either a complete set of their Annual Reports for destructive scanning purposes, or access to their library copies. Since some of the Reports are dwindling in numbers, the association has opted for option 2.

    I will be supplying a Master CD-Rom of the page scans (in pdf format) of the first 50 Reports for a fundraising initiative. I will be suggesting at that time that a small portion of the proceeds could be donated to PGLAF, but no mention of this has been made yet, and there is no obligation to do so.

    ~Marilynda

    P.S. Thanks, Branko, for sharing this with everyone on Teleread. I’m very excited about this project and overwhelmed by the positive response I received from the NNGA’s Board and Members.

  3. This is great, Marilynda. When do the digitization efforts start? I hope you’ll keep us posted via Branko, this comment area and otherwise. The NNGA folks are welcome to join in with their own comments. Should be fun to track the progress of the project.

    If you want, you can even use this comment area as a diary, while keeping the rights to republish the material elsewhere. I think it would be cool to do a chatty narrative and explain in plain English how everything is done. You could start with an informal roadmap.

    Your call. But the invitation is open to you if you’re game, and, as noted, the NNGA folks can add their own perspective.

    Thanks to you and Branko! I hope he has many similar items to report. Meanwhile, maybe you can start by telling people about the importance of the information to be digitized–or perhaps the NNGA folks can, if they’re interested. I know the proposal touches on this. But I’d like a broader context to show the importance of this kind of nut-growing–both economically and socially.

    Good luck,
    David

  4. Thanks, David. I have been working on this since late last year, when I first approached PG about the technicalities of copyright clearance. In January 2005, I submitted the proposal electronically to the NNGA Board and made a full presentation to the Board and Members last Sunday (July 31, 2005) at their Annual Meeting.

    For those who are not familiar with the Distributed Proofreaders model, I will supply a brief description of the process.

    1 – A content provider obtains material to be proofread. The Title Page and Verso are submitted to Project Gutenberg for a copyright clearance. If the book is in the Public Domain, it is scanned and OCR’d. Additional pre-processing is done to correct some of the more common OCR errors (scannos), and the PNGs (scanned page images), TXT and JPEG (illustrations or diagrams) files are uploaded to the PGDP server.

    2 – The Project Manager creates the project by transferring the files into a common project file. Project Comments are written highlighting some of the unique features of the book, as well as any difficulties the proofreaders or formatters may encounter. The Project Comments can also include interesting information about the book, its author and anything else the Project Manager volunteers might find interesting.

    3 – Each project has a Discussion Thread, where volunteers can ask questions or report findings that will assist other volunteers. This thread can be accessed either from the Project Comments page, or from the PGDP Forum link.

    4 – Each page is proofread twice, then formatted twice (tags are added for italics, bold, illustrations, etc., and tables are lined up correctly).

    5 – After 4 different pairs of eyes look at each page of the book, the text files are concatenated and a Post-Processor massages the proofed and formatted files into a single e-book. The entire book is spell checked and formatting (etc.) is checked for consistency. The Post-Processor will also check again for any punctuation or scanno errors. The Annual Reports will be turned into a plain text file, as well as an HTML file (if the Post-Processor chooses, they can pass the HTML task over to the HTML Pool).

    6 – The Post-Processor can also opt to release the book to the Smooth Reading Pool. At this stage, volunteers read the book as they would any other book. If anything “pops” out as questionable, the Smooth Reader will add a comment/query to the text file and return it back to the Post-Processor.

    7 – Once the Post-Processor is satisfied with the e-book, they return it to the PGDP site for Post-Processing Verification, where a more experienced Post-Processor will verify that the project will meet PG standards. If everything looks right, the e-book is submitted to PG for posting in the PG Digital Archives. If additional work is required, the PPV will either make the necessary corrections, or pass the book back to the Post-Processor.

    8 – The PG White Washers will take another look at the book to ensure that everything meets the PG standards and the book is posted.
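
    For readers who think in code, the eight steps above can be sketched as a small state machine. This is only an illustrative Python sketch: the stage names, the Project class and the move function are assumptions made up for the example and are not part of the actual PGDP software.

```python
from dataclasses import dataclass, field

# Stage names are illustrative labels for the steps described above;
# they are NOT identifiers used by the real PGDP site.
TRANSITIONS = {
    "clearance_and_preprocessing": ["project_setup"],      # step 1
    "project_setup": ["proofreading_1"],                   # steps 2-3
    "proofreading_1": ["proofreading_2"],                  # step 4
    "proofreading_2": ["formatting_1"],
    "formatting_1": ["formatting_2"],
    "formatting_2": ["post_processing"],
    # Step 5, with the optional Smooth Reading detour of step 6.
    "post_processing": ["smooth_reading", "post_processing_verification"],
    "smooth_reading": ["post_processing"],
    # Step 7: the verifier may pass the book on or send it back.
    "post_processing_verification": ["whitewashing", "post_processing"],
    "whitewashing": ["posted"],                            # step 8
    "posted": [],
}


@dataclass
class Project:
    """A hypothetical record tracking one book through the rounds."""
    title: str
    stage: str = "clearance_and_preprocessing"
    history: list = field(default_factory=list)


def move(project: Project, target: str) -> None:
    """Move a project to `target` if the workflow allows that transition."""
    if target not in TRANSITIONS[project.stage]:
        raise ValueError(f"cannot go from {project.stage} to {target}")
    project.history.append(project.stage)
    project.stage = target


if __name__ == "__main__":
    report = Project("NNGA Annual Report 1915")
    for stage in [
        "project_setup", "proofreading_1", "proofreading_2",
        "formatting_1", "formatting_2", "post_processing",
        "smooth_reading", "post_processing",
        "post_processing_verification", "whitewashing", "posted",
    ]:
        move(report, stage)
    print(report.title, "->", report.stage)
```

    The transition table makes the two loops in the process explicit: Smooth Reading always hands the book back to the Post-Processor, and verification can either pass the book on or return it for more work.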

    I currently have 5 Annual Reports ready to go. I’m just waiting for the final Copyright Clearance from PG. I expect that the proofreading on PGDP will commence in two to three weeks. There are over 50 reports that are currently in the public domain and the Board may elect to grant distribution rights to PG for those that are still protected. This project will be ongoing at PGDP for some time, at least a couple of years.

    As many of the NNGA volunteers are new to the PGDP proofreading model, I will be creating a special team thread to make them feel more at home. I will add a link to this blog and encourage them to also share their experiences here. It will be interesting to read their perspective.

    As for the Reports themselves, they are the collection of nearly 100 years of nut-growers’ and researchers’ experiences. They offer information on what has worked in particular geographic regions, and what has not. The papers cover a wide variety of trees and topics, from how to get an orchard started, to what to do with the nuts. Bugs, pests and diseases are discussed, as well as general observations. The NNGA has a lot to offer in the way of experience and many of the nut-related experts in the fields of agriculture, silviculture, horticulture and entomology are members of the organization.

    I think it is best to leave any additional description to one of the members of the organization. (I may be able to coax my husband into posting a more in-depth description. He is the president of the Society of Ontario Nut Growers, a local organization that is closely affiliated with the NNGA.)

    ~Marilynda

  5. Many thanks for the informative report, Marilynda, and I hope that your husband and NNGA folks can follow through. No length limit. This thread can go on as long as people want. I’d love to see it last the length of the project, assuming the interest is there! If nothing else, I’ll be interested in how the digitized ARs end up being used. Good luck! – David

  6. Progress Report: I now have 7 Annual Reports ready for proofreading. I will be sending an invitation/introductory e-mail today or tomorrow (I am awaiting feedback on the draft from the two people I sent it to for review). I will be releasing the first Annual Report to the proofreading rounds on October 15, 2005, to allow some time for the NNGA members to become Distributed Proofreaders members.

    I also have 13 other volumes beside my scanner, waiting for scanning.

    In total, there will be 77 volumes produced through Distributed Proofreaders for inclusion in the Project Gutenberg Archives.

    A Team Forum has been created at DP to provide a special place for the NNGA members (and others who work on the Annual Reports) to chit-chat. If anyone else uses this model to reach out to special interest groups, this might be something they would like to incorporate into their plans. This will provide a less intimidating place for new proofreaders to ask questions and become acquainted with Message Boards. Also, if the members of the team elect to have email notification of posts, it serves as an efficient way for the Project Manager to distribute announcements to everyone interested in the project (my initial e-mail is going out to 60+ individuals; future announcements will be as easy as posting a single post to the thread).

    I have extended an invitation to the NNGA members to use this blog as a journal of their experience. I’m looking forward to reading their thoughts on the Distributed Proofreaders process, as well as what this project means to them.

    More when I have more to add.

    ~Marilynda

  7. The trees discussed in these Annual Reports are potentially a tremendous resource for the whole planet. They have been neglected and abused by modern civilization and information about them is difficult to come by. These reports are one of the few good sources of data on the propagation, physiology and culture of Nut Trees.

    Just last weekend, the individual who loaned some of the Annual Reports to my wife for scanning needed two of them back to loan to another individual for research purposes.

    I could go on about our modern disregard for native knowledge and horticultural practices, but that is off topic for this blog.

    p.s. I proofread my first page, and it was pretty painless.

  8. I read your response to Chris (Muddy_Nut of DP). He laughed and said that it is impossible for him to not think of books already. Any way he turns his head in our house, he is likely to see books, especially in the library/computer room.

    I made a vital error in deciding when to release the first Annual Report. I wanted to wait until I had a few (10) volumes ready so there would be a steady flow of material into the first proofreading round. However, I happened to release them at the height of the fall harvest.

    A note for anyone else who might decide to do similar work with an organization such as the NNGA … also keep busy or special times of the year in mind when preparing the material. If you don’t, you may find that the “build it and they will come” philosophy might not pan out immediately.

  9. I sent an announcement to the Board of Directors at the NNGA on Friday. The first “Nut-Culture” book has been posted to Project Gutenberg. Although this is not one of the Annual Reports, it is a book that was written by a past president.

    “Nut Growing in the North: A Personal Story of the Author’s Experience of 33 Years with Nut Culture in Minnesota and Wisconsin” by Carl Weschcke is now available in the Project Gutenberg archives. The direct URL to the files is http://www.gutenberg.org/etext/18189, and the book is available in HTML and Plain Text (Latin-1 and iso-8859-1 character sets).

    I am very excited about this posting (almost as excited as I was by my first posting to Project Gutenberg, or when a series of my books was selected as the 15,000th book on PG). I described my excitement to the NNGA Board as being akin to having one of their first trees bear its first crop.

    The first NNGA Annual Report to go through the Distributed Proofreader process (from 1915) is in its final stage before being posted (Post-Processing Verification). With any luck, it will be finding its new home in the PG archives shortly. Six others are well on their way.

    ~Marilynda

  10. As requested, here’s the full text of the e-mail announcing the posting of Carl Weschcke’s book. Carl is a Past-President of the organization.

    First Nut Related Book Posted to Project Gutenberg!

    On April 17, 2006, the first nut-related book finished its journey through the rounds of proofreading and formatting and found a new home in the archives of Project Gutenberg. The book (and author) in question is one I’m sure you are all familiar with. “Nut Growing in the North: A Personal Story of the Author’s Experience of 33 Years with Nut Culture in Minnesota and Wisconsin” by Carl Weschcke is now available in the Project Gutenberg archives. The direct URL to the files is http://www.gutenberg.org/etext/18189, and the book is available in HTML and Plain Text (Latin-1 and iso-8859-1 character sets).

    To date, there has been great progress in the “Nut Culture” Project on Distributed Proofreaders. It is always a concern for a Content Provider who introduces a new subject matter that there will not be sufficient interest in the material. This has not been the case. Although the books are moving slower than anticipated, this is likely due to several changes that were made to the process in the past year. Instead of two rounds of combined Proofreading and Formatting, we have moved to two rounds dedicated to Proofreading and two additional rounds dedicated to Formatting. This change was made in July 2005. Since then, there has also been a change in the Guidelines for Proofreaders and Formatters, including a reassignment of some of the duties. It is understandable that books are now taking twice as long as they once did to reach Project Gutenberg. It is our hope that these books are also that much nearer perfection than they once were.

    As some of you are aware, the folks at Distributed Proofreaders are quite detail oriented. Some of the proofreaders will pause when they notice something out of the ordinary (even in subject matter they are unfamiliar with) and take a few moments to research their concern. In the past couple of weeks, we have come across a name that was spelled various ways (on the same page!), as well as ligature errors in Latin botanical names ([ae] vs. [oe]). I truly believe that the changes in the process have improved the accuracy.

    To give you an idea of the progress that has been made, here’s a list of the botanical books that are currently in progress:

    Books that Have Completed Proofreading and Formatting. Post-Processors are now converting the individual pages back into a single file for posting to PG:
    – NNGA Annual Report 1915 – has been post-processed. It is now being verified (final check) before being posted.
    – Sylva, or, A Discourse of Forest Trees by John Evelyn (originally published in the 1600s, this is a 1908 reprint)

    Formatting – These books have been proofread twice and are now ready to have the special formatting inserted (such as italics and Small Caps)
    – Elements of Structural and Systematic Botany by Douglas H. Campbell
    – NNGA Annual Report 1917

    Proofreading – These books are somewhere in the process of being Proofread (either in P1, are waiting for release to P2, or are in P2)
    – NNGA Annual Reports 1919, 1921, 1930, 1933 and 1951
    – Walnut Growing in Oregon by Jacob Calvin Cooper (1910)
    – The Pecan and Its Culture by H. Harold Hume (1906)
    – English Walnuts by Walter Fox Allen (1912)
    – The Genus Pinus by George Russell Shaw (1914)

    The following books will appear in the Proofreading Rounds soon:
    – NNGA Annual Reports 1914, 1920, 1934 and 1943 (are almost ready for proofreading)
    – Other NNGA Annual Reports, up to 1963 will be scanned and pre-processed. These have already received Copyright Clearance from Project Gutenberg
    – Annual Reports from 1964-1988 are also in the Public Domain. I will submit these for Copyright Clearance when we near the end of the earlier material, or when it becomes more difficult to ensure a steady stream of material into the Proofreading Rounds.
