Digital Archiving and Information Services

ERecsDay and Electronic Records Day 2019

Posted on Oct 10, 2019 in Blog, Portfolio

Today, 10/10/2019, is ERecsDay and Electronic Records Day.  As part of the growing awareness of the challenges of managing and archiving electronic records, the Council of State Archivists commemorates October 10 as Electronic Records Day.  For those of us who have worked in digital archiving over the last decade, it’s a great relief that we do this each year.  Of course, most of the good stuff happens in bit-sized tweets on Twitter, but it’s not all ephemeral!

So, this year for ERecsDay and Electronic Records Day, we’ve been tracking some of our favorite tweets on Twitter.  In fact, this year we interacted with archivists at a variety of institutions including the National Archives, the Council of State Archivists, and the Library of Congress.  Here are some resources, including quizzes and news stories.

#ERecsDay TAKEAWAYS

  1. There is a “Did you know: Some Interesting Facts About Digital Media” quiz from the Library of Congress. Though it’s short, it serves as a good baseline and shows how archival knowledge and insights affect our daily lives. So check it out!
  2. Every year, the Council of State Archivists shares some of their excellent resources.  Check them out here!
  3. “10 reasons why electronic records need special attention” (PDF) at CoSA
  4. The National Archives provided answers about recordings and/or transcripts from different White Houses. They also provided links to the Nixon Presidential Library and to the Lyndon B. Johnson Presidential Library.
  5. Preservica announces the release of “The State of the State,” a report on state archives in the US for the Council of State Archivists (PDF).

All of these resources provide critical information related to electronic records and digital archiving!
If you have any questions or need some assistance, check out our site of past clients or our services.

Remember, it is very, very, very, very, very, very very important.

Impeachment Inquiry: Whistleblower Complaint (pt 3)

Posted on Oct 3, 2019 in Blog

As we join the nation in watching how the Impeachment Inquiry Whistleblower Complaint plays out, we have designed a prototype Digital Archivy scorecard for informational appraisal.  The Digital Archivy Scorecard assigns grades based on Assessment, Identity, Description, Priority, and Security Classification.

In this way, we can determine the value of content based on provenance, function, significance and accuracy.  With that in mind, we will look at another piece of critical evidence: The Whistleblower Complaint.   This will allow us to assess the accuracy and trustworthiness of the different data inputs that will be examined over the course of the next few weeks.

IMPEACHMENT INQUIRY: WHISTLE-BLOWER COMPLAINT

The whistleblower filed his/her 9-page Whistleblower Complaint after the phone call between President Trump and Ukrainian President Zelensky. It is dated August 12, 2019, and though it is unclassified, it has significant redactions.  Further, though it is in PDF format, it is not text-searchable.

As we analyze the source, we examine its relevance and provenance to gain a fuller understanding of its import.  With this in mind, we gave significantly different scores for the Whistleblower Complaint compared with the previous blog entry (“Digital Archivy Scorecard on Information Appraisal (part 2)”).

In large part, this is because we are confident in the identity of the sole author. We understand his perspective and believe in the accuracy and plausibility of his first-hand evidence. The clear language and thorough descriptions are all positive and could be used to support other sources. However, there are questions related to the document’s authenticity, provenance and chain of custody.  Because there are redactions due to sensitive intelligence issues, the Description score suffers.   This is a critical concern because accusations of a “mafia-style shake-down” are urgent and quite serious.

The priority of this source of information is very high, but it gets a B in Security Classification because parts of the complaint are redacted. This obscures and affects the complaint itself. Consequently, it also may change the meaning or message of the information itself.

Whistleblower Complaint

However, on the whole, the Whistleblower Complaint is B-grade material.  This information source is high-priority.

Stay tuned for Part 4.
Check out Part 1 on the Information Appraisal scorecard here.

Digital Archivy Scorecard for Information Appraisal (pt 2)

Posted on Sep 30, 2019 in Blog, Portfolio

Digital Archivy Scorecard on Information Appraisal, part 2

As we join the nation in watching how the #ImpeachmentInquiry plays out, it is an excellent time for us to design a prototype Digital Archivy scorecard for informational appraisal.  The Digital Archivy Scorecard will provide grades based on Assessment, Identity, Description, Priority, and Security Classification.

In this way, we can determine the value of content based on provenance, function, significance and accuracy.  With that in mind, today we will look at one piece of critical evidence: the transcript of the July 25 phone call between President Trump and Ukrainian President Zelensky.  The 9-page transcript of the conversation has disappeared from public view. It was replaced by the 5-page Memo of Conversation prepared and released by the White House. This piece focuses on the “idea” of the original transcript.

THE TRANSCRIPT

The transcript is a record of the phone call. As far as we know, there is no audio recording of it. There is a chance, of course, that the Ukrainian Government made an audio recording. Assuming there is no recording, this transcript was created by 12 employees who listen in on the call and jot down notes. Later these notes are compiled and combined by somebody, and then they are used to re-create the transcript. There is no guarantee that the final version is the most accurate representation.

You can see from the scores below that it is lacking as a trusted source of information. By our scorecard, we give it straight C’s in five categories: Assessment; Identification; Description; Assign Priority; and Security Classification.

In the first stage, Assessment, we grade as a C because we cannot confirm provenance with regard to authorship. It may be collaborative, but it contains spelling errors (misspelling they’re as there). It also loses points due to the fact that editorial changes were made prior to public release.

Identification gets a B. Its authorship is clear, but this is not a unique (“smoking gun”) transcript.  Rather, it is one of many different calls between Trump and Zelensky. There were additional conversations between Zelensky and Trump, Mike Pence, Rudy Giuliani, and probably others.

Scorecard Descriptions

We give a C to Describe because the content itself is not complete. There are, for example, a number of sentences that contain ellipses. This indicates an incomplete transcript. Without knowledge of the call’s duration, the subject matter, or even the number of participants, we cannot trust that this 5-page transcript is complete and accurate.

We grade Priority on this information source as a D.  It is not authoritative and may serve other purposes.  Also, it loses data integrity because it is an interpretation of an aural phone call. It is in a written format. This is key. Since we do not have access to additional supporting materials (e.g., complete notes or an audio recording) yet, this is a non-trusted source.

We give Security Classification an F. This document was declassified and semi-redacted and clearly serves political issues. In fact, the redactions serve to undermine the authority of the message.  We cannot look at it as an unvarnished truth. Also, it is an interpretation of one of many phone calls in which the US Administration asked for a favor and withheld funds promised and approved by the Congress.

Conclusion

Read more about the transcript itself from The Washington Post. However, the article focuses more on the preparation of the MemCon (Memorandum of Conversation). The Post also warns us about its value: “Don’t rely on whatever transcript is released,” said a former staffer, who spoke on the condition of anonymity to comment candidly. “Even if it’s unredacted; those transcripts are heavily edited by political leadership at NSC. I’ve seen substance deleted from these call ‘transcripts’ to delete either superfluous details or more substance.”  Here’s an article from Quartz that addresses the “transcript” described herein.  They state it is full and unredacted, but it is “not a verbatim transcript of a discussion.”

Find out more about our clients and work.
Check out Part 1 on the Information Appraisal Scorecard here.

Stay tuned for Part 3 of this blog series.

The Information Appraisal Scorecard (pt 1)

Posted on Sep 26, 2019 in Blog

Digital Archivy has developed an Information Appraisal Scorecard to assist clients. This blog entry is related to one practical modern-day example.

information appraisal scorecard

Many institutions including, evidently, the White House, face significant cyber challenges. First of all, they struggle to create effective information systems. They also struggle in implementing efficient workflows.  So, with some awareness of needs, they would benefit from an information appraisal scorecard. Most notably, this helps capture and codify provenance, metadata, and value. Ultimately, it will also assist in measuring users and usage with metrics.

From the perspective of digital archivy and information management and other records management nerds, the Impeachment Inquiry will be riveting. It will also be an enlightening experience.

So, with this in mind, we address the extant evidence using information management practices while the case unfolds before our eyes.  As evidence is uncovered, declassified, and made available, we will appraise sources, assess the content, and assign values. We aim to evaluate content to create a better information ecosystem. And we will build a system that tracks the evidential and informational value.

INFORMATION APPRAISAL SCORECARD CRITERIA

Because of this, we look at ways to quantify the data. We developed our scorecard to simplify this objective work. We began to look at the foundation, content, and data streams. Here are a few of our criteria:

  • assess the media format or the source materials
  • identify the content type
  • describe the content itself
  • assign the level of priority
  • classify security levels as needed
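As a rough illustration, the five criteria above could be recorded as letter grades in a small data structure. This is only a sketch: the `Scorecard` class, the grade-point mapping, and the averaging rule for an overall grade are assumptions made for the example, not the actual Digital Archivy method.

```python
from dataclasses import dataclass

# Assumed mapping from letter grades to points (illustrative only).
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


@dataclass
class Scorecard:
    """One letter grade per criterion for a single information source."""
    source: str
    assess: str
    identify: str
    describe: str
    priority: str
    security: str

    def __post_init__(self):
        # Reject anything that is not a known letter grade.
        for grade in self.grades():
            if grade not in GRADE_POINTS:
                raise ValueError(f"unknown grade: {grade!r}")

    def grades(self):
        return [self.assess, self.identify, self.describe,
                self.priority, self.security]

    def overall(self):
        """Average the five grades and return the nearest letter.

        Ties go to the higher grade, because "A" is checked first.
        """
        mean = sum(GRADE_POINTS[g] for g in self.grades()) / 5
        return min(GRADE_POINTS, key=lambda g: abs(GRADE_POINTS[g] - mean))


# Hypothetical source and grades, for illustration only.
card = Scorecard("example memo", "B", "B", "C", "C", "C")
print(card.source, card.overall())
```

A simple average treats all five criteria as equally important; a weighted average would be a reasonable alternative if, say, Priority should count for more.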

We build on our experience appraising archives and advising on records retention schedules. As a result, we apply critical thinking and problem-solving. By examining and evaluating the information and evidence, we compare scores in the scorecard. This is useful for best practice.  In addition, this helps us assign an information appraisal value that will:

  • determine whether the value is evidential or informational
  • consider the volume and quantity or frequency of data and digital assets
  • evaluate the uniqueness, authenticity, accuracy and completeness
  • assess complexity of data and information relevant to users and usage
  • apply access restrictions and user permissions

As a consequence, and by using these criteria as a baseline, we create a system to track values based on the source, content, and users of the documents.

IMPEACHMENT INQUIRY

In advance of today’s (September 26) #ImpeachmentInquiry hearing, records were released.   They have different informational value and, in some cases, evidential value as well.  They include a variety of content and materials.  Specifically:

  • Transcript of phone call
  • Whistleblower complaint
  • Testimony of DNI
  • ICIG’s Letter
  • White House Memo of Conversation
  • Record and notes from phone call by observers; and
  • Audio recording of phone calls/conversations between POTUS and Ukrainian President

information appraisal scorecard

In addition, as we look at the network of connections to the left, we can map a new valuation.  Next time we will introduce the Information Appraisal Scorecard. We will also show a sample with the July 25 Phone Call Transcript. Furthermore, you can access a copy of the IC IG’s letter here.

Above all, check out examples of some of our client work.  They illustrate how we develop similar systems and metrics.

“The ebooks will stop working.”

Posted on Jul 3, 2019 in Blog, Featured

“The ebooks will stop working.” I gasped when I first read that.
Last week, I was doing some research and surfing Twitter on my cell when I saw a spine-tingling tweet with a sentence that made me stop in my tracks:  “The books will stop working.” After regaining consciousness, I stewed and mulled and collected my thoughts.  Then I wrote my most popular tweet ever. It practically went viral!  It got more than 31,000 click-through engagements and nearly 150 likes and 130 retweets.  I’m still recovering. . . .

What a horrifying sentence, I thought.  As someone who has worked with digital libraries and archives, I’ve been thinking about ebooks not working and about digital rights for a long time. This disturbing news was gratifying and reminded me of the inspiring science fiction writings and non-fiction articles by Cory Doctorow.

DRM

So here’s the story.  In April, Microsoft announced that they will shut down their ebooks store and grant refunds. The FAQ page states “your books will be removed from Microsoft Edge when Microsoft processes the refunds.” In other words, they will refund purchases and then remove their DRM books.  They are eliminating their whole e-book ecosystem!

In 2004, science fiction writer Cory Doctorow warned us about ebooks. His speech is still freely available online (through the public domain) in many formats. I can still remember the thrust of his argument:

  1. DRM systems don’t work.
  2. DRM is bad for society.
  3. It is bad for business.
  4. It is bad for artists.
  5. DRM is bad business for Microsoft.

Originally intended as an anti-piracy measure, DRM has changed. Now it primarily functions to lock customers into a specific ecosystem. This restricts our ability to read or view or listen to purchases wherever and however we want. This cycle has persisted for decades and it shows no signs of abating.

However, at present we may not have all the information or terminology to discuss this vexing issue in pragmatic terms. In some ways this shares similarities with technological or format obsolescence issues.  DRM is more complex because it represents a time-based and privately-owned security mechanism.

Obsolescence Learned

Technological obsolescence often furthers a carrier format improvement. For example, the sound quality recorded on analog vinyl LPs was improved by digital re-engineering and noise reduction. Digital optical CDs captured recordings that sound cleaner and, some say, more sterile. But the market was built upon persuasive promises that compact discs provide “perfect sound forever.”

In the case of DRM, though, there is a more appropriate example of format obsolescence. This one hits closer to home in the born-digital world.  In the late 1990s, Macromedia Flash became the new killer app. It was a frame-by-frame animation tool that simplified vector animation and interactive publishing for the web.  The software enabled users to create interactive animations on a timeline and to capture and upload moving image files. This was a huge improvement on the static HTML pages of yesteryear!

The ebooks: After-Flash Math

Adobe bought Macromedia and Flash in 2005.  Things got rough for Flash when the iPhone was released in 2007 without Flash player support.  Then, a few years later, just prior to releasing the iPad, Steve Jobs stuck the dagger in when he announced Apple was stopping support for Flash. This exclusion from the Apple ecosystem was deadly for Flash.

With advances in open standards and exclusion from the Apple ecosystem, Flash languished. Its software usage and proprietary formats became less ubiquitous. HTML5, a new open standard, became the go-to replacement.  YouTube, the largest provider of Flash video, was one of the first to migrate content to the new standard. In July 2017, Adobe announced they too would end support for Flash player and software in 2020.

Digital Rights Management represents a new structural threat to accessing our content.  This example of Microsoft removing ebooks, and evidence of books, from people’s libraries is horrifying.  This is the tyranny of DRM and it illustrates the grave threat of additional proprietary algorithms.

So next time you’re downloading some music or purchasing an ePub, you may want to ask yourself, “Will the ebooks stop working too?”

Check out the Cory Doctorow speech from 2004 on Craphound: https://craphound.com/msftdrm.txt
Or if you prefer a fancier format, check it out here in glorious PDF.

“Content Is King and Storage Is Cheap”

Posted on Mar 29, 2019 in Blog

We often hear “content is king” and “data storage is cheap.” But few will point out how difficult it is to identify and separate content from data.  Content includes intellectual property. But it also may include emails or text messages. Though content may be Top Dawg, long-term preservation is expensive. Formats, technical requirements and time frame affect storage costs and strategy. Yes, storage is cheap now, but over time, it becomes expensive and costs add up quickly.

Appraising data prior to ingest is invaluable. This builds trust in a system where users can find and use assets.  Storage is cheaper than cheap when less data is stored.
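To make the arithmetic concrete, here is a back-of-the-envelope sketch. Every number in it (collection size, price per terabyte, number of copies, retention period, and the share kept after appraisal) is hypothetical, and the calculation assumes a flat price with no data growth.

```python
def cumulative_storage_cost(terabytes, price_per_tb_month, years, copies=1):
    """Total cost of keeping `terabytes` of data for `years` years.

    Assumes a flat monthly price per terabyte and a fixed number of
    preservation copies; real pricing and data volumes change over time.
    """
    return terabytes * copies * price_per_tb_month * 12 * years


# Hypothetical figures: 100 TB, $20 per TB per month, 3 copies, 10 years.
everything = cumulative_storage_cost(100, 20, 10, copies=3)

# Same collection if appraisal before ingest keeps only 40% of it (assumed).
appraised = cumulative_storage_cost(40, 20, 10, copies=3)

print(f"keep everything: ${everything:,}")
print(f"after appraisal: ${appraised:,}")
print(f"difference:      ${everything - appraised:,}")
```

Even with made-up numbers, the point holds: a modest monthly price multiplied by copies and years becomes a large commitment, and every terabyte appraised out before ingest is saved many times over.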

Content Is King

Content lies in the Venn Diagram intersection sweet spot of People, Process and Technology.  As a result, people, process, and technology create content.  Workflows can be well-defined if challenges are phased. Also, solutions must be multi-dimensional. The question becomes, how do creators and users find, access, and share the resources they need?

To some, it is unreasonably obtuse to invest in an organization’s organization.  But many gradually will see that great value is added with a single unified system.  One single source of truth may become a trusted digital repository. If costs are shared, and if it serves many users, an information infrastructure is cost-effective. Armed with the content they need, users are empowered to seek, use, and share information.

Storage Is Cheap

Each information system supports a wide network of extraordinary groups and unique individuals. While accessing information, users employ different processes and recognize their different needs and objectives. The information they work with includes:

  • Data for documentation of records
  • Intellectual Property
  • Sensitive or confidential data with PII
  • Information licensed for limited usage
  • Internal and external resources used for reference purposes
  • Evidential and analytical data from reports and projects
  • Communications
  • Promotional and marketing materials

Information creates and adds value to the network. It frames every element of access, need, and use. An understanding of content usage also benefits users.  An effective metadata schema will build a sound infrastructure.  With guidance, a taxonomy can frame knowledge drawn from internal resources. Controlled vocabularies, preferred terms, and acronyms compiled from a style guide will fortify a trusted system and help with user adoption.
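For a sense of how a controlled vocabulary works in practice, here is a minimal sketch that maps variant tags to preferred terms before they are stored as metadata. The vocabulary entries and function names are hypothetical examples, not terms from any real style guide.

```python
# Hypothetical controlled vocabulary: variant spelling -> preferred term.
PREFERRED_TERMS = {
    "pii": "personally identifiable information",
    "ip": "intellectual property",
    "e-mail": "email",
    "emails": "email",
}


def normalize_tag(tag):
    """Lowercase the tag, collapse whitespace, and apply the preferred term."""
    key = " ".join(tag.lower().split())
    return PREFERRED_TERMS.get(key, key)


def normalize_tags(tags):
    """Normalize a list of tags, dropping duplicates but keeping order."""
    seen = []
    for tag in tags:
        term = normalize_tag(tag)
        if term not in seen:
            seen.append(term)
    return seen


print(normalize_tags(["PII", "E-mail", "Emails", "Reports"]))
```

Terms that are not in the vocabulary pass through unchanged here; a stricter system might flag them for review instead.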

Consequently, we build effective and efficient information management systems. Institutions migrate structured and unstructured data to the Cloud. Sooner or later, they will need a strategy. Without a plan, they practically guarantee they will misunderstand their institutional knowledge.

Content is King, but context is queen and metadata is a prince.

Check out my experience from a list of clients with whom I’ve worked: http://www.digitalarchivy.com/clients/ 
