The citation tree finally reached an impasse when the sheer volume of text overloaded Word’s spelling and grammar checking capacity. Because the OCR scanning method I was using to extract citations required frequent (and occasionally extensive) proofing, what might otherwise be seen as a minor bug actually signaled the system’s failure. Without automatic proofing tools, the small margin of machine error introduced by OCR exceeded my ability to correct it. Pursuing the project meant severing the branches of the tree into separate documents (which would eventually need to be severed again as the amount of information grew) or working through all of the errors manually (which would virtually cancel out the amount of time saved by copying and pasting citations from OCRed text rather than typing them out myself).
This failure of the citation tree illuminated a symmetrical failure in the file tree. Both were unable to create thematic categories that were not exclusive – categories that would allow for kinds of growth and connectivity that were not strictly vertical. I could only manage redundancy by allowing each reference or citation to appear once within their respective trees but, despite my attempts to maintain thematic clarity across the various branches, they refused to grow outward without also growing backward and inward upon each other in gnarled trans-thematic clusters.
The real question, I realized, was: how can I remember anything if I can only keep it in one place? Counterintuitive as it may sound, it is inherent in almost every mnemotechnical failure thus far. Limiting the number of locations in which texts are stored can be helpful when the mnemotechnology is written, printed or neuronal but, even then, any transfer of information between media must be seen as an act of mnemonic doubling. Although the number of locations that we can manage in the wetware of our minds is relatively low, it is difficult to deny that, within certain limits, we remember things better when they are linked to a variety of places. Much of our memory remains accessible through the iterated pathways between mnemonic places and, thus, it’s important not to think of “place” in an overly literal sense. After a certain point it is no longer productive to regard the process of storing and recalling information in the same way we regard the task of locating a printed book in physical space. The influence of physical space on our memory is considerable, but not absolute; a well-designed digital interface has the potential to radically reconfigure the bookshelves out of which it evolved.
If there is not a finite limit to the number of mnemonic places we can meaningfully query with the assistance of our technology, then the limit must lie in the efficiency and speed with which we navigate these places. Workflow delimits workspace even though the (non-finite) limits imposed by habit often suggest otherwise. This is to say that the space in which we work is ultimately as vast as the space in which we can imagine ourselves working – dependent on how far we can stretch the metaphor of space beyond the physical world through which we came to know it.
Once I grasped how both trees were blighted from the start because their media prevented them from growing together, I began looking for a program that might be able to store both references and the citations within them in a variety of locations without creating the redundancy and inefficiency I had encountered thus far. What I was looking for (though I did not realize it at the time) was a database or, better still, a knowledgebase.
OneNote was a promising option in that it allowed me to divide various branches of the citation tree into notebooks, sections and pages – separate documents on my hard drive that could be aggregated together without overloading the proofing tools. The cloud storage capacities of OneNote and its compatibility with Word were obviously appealing in their own right, but I was still hesitant to embrace OneNote because it seemed to sacrifice much of the layout design and prepublication capabilities of Word. Most importantly, however, OneNote lacks the ability to create and manage metadata that we find in a program like Evernote. But Evernote, too, has some rather damning limitations. It’s not quite a reference manager or a word processor so, when it comes to preparing documents for publication or print, we still end up having to transfer everything into a word processor. It does extend metadata to discrete pieces of text, but this is limited by poor PDF integration. Like many other citation and annotation programs, Evernote still allows citations to fall out of sync with their parent document. The hassle of manually maintaining the connection between the two, in my experience at least, promotes a superficial use of metadata.
I also tried out some of the more popular reference managers like Mendeley and Zotero, which allowed me to manage categories with multiple instances of the same work and offered the ability to automatically generate citations for each text (untangling the file tree and streamlining it with my word processor). I could even attach PDF files to the references in order to view and annotate them natively. I thought, at first, that I had found the workflow I had failed to achieve with either intratextual PDF annotation or the extratextual citation tree. The problem was that the metadata for both of these programs, even with the relevant updates and plugins, did not reach deeply enough into the discrete citations.
The problem of concentric citations returns here in a new shape. Before, I asserted the importance of delineating and annotating phrase/sentence-level citations from the paragraph/page-level citations that contain them. Here, we can see how much of the software fails to even distinguish discrete citations from entire works. If we want our technology to help us thematize more effectively, we need a program built to handle these concentricities. Without such a program, even the most robust and interactive document cloud will only exacerbate our worst thematic tendencies – reifying them in an infrastructure that only enables us to tag each book by its cover.
While some combination of the aforementioned programs would have resolved some of the problems I was facing, I was generally unimpressed with the way they interfaced with PDF texts. I wanted more reciprocity between the intratextual PDF notes and the external tree outline but was struggling to find a program that could even link citations directly to a specific PDF page. I contacted Adobe about enhancing the annotation capacities of Acrobat and was quickly put in touch with Walter Chang, a principal scientist and member of their natural language and text analytics group, who explained how several of my interests overlapped with those being explored at Adobe. Adobe was interested in mining semantic content on their document cloud, whereas I was trying to find the tools for a more modular classification system for personal use. I am grateful to Walter for being the first to encourage me to consider how these improvements might also fulfill the more universal needs of the information technology industry: the possible synergy between human and machine markup and its implications for linked open data, the semantic web and machine learning that I will discuss later. Walter also advised me to seek out existing models for the kind of enhancements I was proposing. It was in doing so that I came across a beta version of Citavi 5, a program that was quite popular in European academic circles but relatively obscure in the US at the time.
Having used Citavi 5 for over a year I can say with some confidence that it exceeds the programs above because it is, at once, a database for references and the citations (or “knowledge items”) within them. What’s more: the link between “reference” and “knowledge” is preserved on the most fundamental level because each citation is directly anchored to specific lines of specific PDF pages. Citavi has enabled me to remember more by storing more pieces of information in more locations and, thus, to thematize more exactingly within and between works.
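To make the structure concrete, the relation between a reference and the knowledge items anchored within it might be sketched as follows. This is only an illustration in Python (3.9 or later), not a description of how Citavi stores its data: every class, field and file name below is hypothetical, and the sample passage is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Anchor:
    """Hypothetical pointer to a span of lines on one page of a PDF."""
    pdf_path: str
    page: int
    first_line: int
    last_line: int


@dataclass
class KnowledgeItem:
    """A citation that always carries the location it was taken from."""
    text: str
    anchor: Anchor
    keywords: set[str] = field(default_factory=set)


@dataclass
class Reference:
    """A whole work, holding the knowledge items extracted from it."""
    author: str
    title: str
    items: list[KnowledgeItem] = field(default_factory=list)

    def cite(self, text, pdf_path, page, first_line, last_line):
        """Create a knowledge item and keep it attached to this reference."""
        item = KnowledgeItem(text, Anchor(pdf_path, page, first_line, last_line))
        self.items.append(item)
        return item


# Placeholder data: no real work or quotation is implied.
work = Reference(author="Author A", title="Work A")
passage = work.cite("placeholder passage", "work_a.pdf", page=12, first_line=3, last_line=7)
passage.keywords.add("memory")
```

Because the anchor is part of the item rather than a separate note, a citation in this sketch cannot drift out of sync with its parent document in the way described for the earlier tools.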
The shifts from written annotation of printed text to intratextual annotation of digital text to extratextual annotation of digital text that I have described above each took an exorbitant amount of time because they required a total organizational overhaul within and between documents. With Citavi, however, these large scale reconfigurations can be made precisely and rapidly because its advanced search and batch processing tools make it possible to modify the content or metadata for selections that are as vast as they are specific. Its purpose (qua knowledgebase) is the organization of texts rather than organization within texts.
Before I say any more about the various categories and keywords I’ve created in Citavi, I should clarify the extent to which they participate in the prevailing thematic practice of the university and the extent to which they might resist it. I freely admit that many of my primary categories work on a level that is more or less equivalent to the familiar and problematic genres we find in the humanities. They are macrothematic: generalizing to the level of the author/work. The various subcategories and subthemes beneath them, however, grow increasingly microthematic: referring to specific citations within the works themselves. Keywords, as I’ve been using them thus far, represent the most microthematic layer of metadata since they help tag the content within each citation.
While I feel that microthematic tagging has the most potential to alter our thematic practices overall, this does not mean that macrothematic categories do not also have some transformative power or that they simply reproduce established genres (as if genre were something transparent and fixed that could easily be represented in hierarchical form). Many of the macrothematic categories I describe here are valuable insofar as they are personalized. In order for themes to contribute something meaningful to the knowledgebase they need to be personalized in such a way that they:
- convey the maximum amount of information with the minimum amount of redundancy
- avoid placing too few texts in too many categories or too many into too few
- aggregate less familiar sources more extensively than familiar sources
The result will always be something of a hodgepodge, but one that juxtaposes texts in a way that facilitates the kind of comparative analysis required to delineate knowledge along increasingly nuanced thematic lines. However fraught they may be, we might learn a great deal about the inherent structure of our minds by watching the asymmetry of such themes in motion within a knowledgebase.
You probably will not learn anything profound by looking at the macrothematic levels of my knowledgebase, but you will see the various proclivities and inconsistencies of my knowledge represented far more transparently and concisely than would have been possible with any previous mnemotechnology. At the macrothematic level, the value of themes lies more in the ideology they expose than the ‘truth’ they reveal. A knowledgebase like Citavi, if used collectively, might enable us to see that our teachers, colleagues and students are not necessarily or exclusively the thinkers they appear to be within the context of a lecture, conference or final paper. This is not to say that we should all grant each other access to the deeper recesses of our personal knowledgebases or that, if we did, we would necessarily find some more complex and authentic mind hidden behind the contingencies – not even that we can think anything at all without some degree of contingency – it is only to suggest that whatever contingencies might pervade our organizational structures could be rendered more visible by this kind of mnemotechnology than by any that has come before. Perhaps, if academic institutions were charged with the maddening task of maintaining clarity and consensus on this macrothematic level within a collective knowledgebase, we might all be better equipped to distinguish more exigent classification problems from contingent pseudoproblems (i.e. problems that are exposed as artifacts of a thematic machinery incapable of administering a good self-diagnostic).
Let us turn now to the revised categorical system that evolved out of the citation and file trees. Rather than restricting all of my texts and citations to one branch of a unified hierarchy I now sort them into three or more positions across three branches.
The ability to manage multiple instances in multiple hierarchies means that texts and citations no longer need to be subordinated primarily to themes simply because the medium can only sustain one text per branch. I can preserve the author-work-theme hierarchy of the citation tree alongside any other thematic categories I might devise. PDFs with multiple authors can be assigned to multiple branches and individual citations by each author can be distributed accordingly. After reading and extracting citations from a work, I now review and tag them en masse, creating sub-themes specific to the sampling. If the material is fairly focused, I might use the individual chapter headings. If its subject matter is wide-ranging (or my purposes for using it are varied) I might create these subthemes myself. I typically do this after I have finished reading the work in its entirety because the process of reviewing and categorizing each citation after reading has proven to be a particularly stimulating mnemonic exercise. I know that the passages I have marked will eventually be revisited when I have more context, so the first read through functions more as a survey of the terrain than anything else (a strategy which has proven particularly effective for systematic treatises and encyclopedic novels).
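The idea of filing one citation in several branches at once, without copying it, can be sketched in a few lines of Python. This is an illustration under my own assumptions rather than Citavi's actual data model: the class name, the method names and the item identifier are all hypothetical, and the category paths are loosely modeled on the author and project branches described in this essay.

```python
from collections import defaultdict


class Knowledgebase:
    """Hypothetical store in which an item can sit in many category paths."""

    def __init__(self):
        self._members = defaultdict(set)  # category path -> item ids
        self._paths = defaultdict(set)    # item id -> category paths

    def assign(self, item_id, *path):
        """File an item under a path such as ("author", "McCarthy")."""
        self._members[path].add(item_id)
        self._paths[item_id].add(path)

    def under(self, *prefix):
        """Return every item filed at or below the given category prefix."""
        n = len(prefix)
        return {
            item_id
            for path, ids in self._members.items()
            if path[:n] == prefix
            for item_id in ids
        }

    def paths_of(self, item_id):
        """Return every position a single item occupies."""
        return set(self._paths.get(item_id, ()))


kb = Knowledgebase()
# One citation (hypothetical id), two positions, no duplication of the text.
kb.assign("cit-001", "author", "McCarthy", "Blood Meridian")
kb.assign("cit-001", "project", "composition class", "Western genre")
```

Querying `kb.under("author")` and `kb.under("project")` returns the same item from two different hierarchies, which is the property the single-branch trees could not provide.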
The ‘project’ branch allows me to sort texts for more immediate and occasional ends without disrupting the fundamental organizational structure of the other branches or depleting them of content. Essays and classes that adhere closely to a particular author or theme could probably be managed fairly easily within the ‘author’ branch, but I’ve found that organizing a narrower sampling of texts and citations according to the specific needs of a project is especially useful in instances when the essay I’m writing or the class I’m taking/teaching spans several authors, time periods or themes. Without their own dedicated branch these projects would muddle many of the distinctions between various works and citations that I would otherwise like to maintain. If I happen to form categories that have a greater relevance beyond the context of the project, I can easily graft them back onto the author branch. I currently have separate project branches for all of the classes I have taught so far, qualifying exams, the backlog remaining from the citation tree, extra-curricular reading groups, personal reading lists and even this very essay.
The ‘project’ branch also leaves room for me to experiment with different strategies of organization in a non-destructive manner. One example of this is when I wanted to approach Cormac McCarthy’s Blood Meridian from a comparative standpoint – focusing on the conventions of the Western genre for a composition class that I was teaching. This was quite different from how I might view it in my own writing and research. Citavi not only enabled me to create two distinct thematic hierarchies for an overlapping selection of citations, it also allowed me to weave together quotations from the novels and films with screenshots from the latter in the hierarchy devoted to the class. Being able to juxtapose speech, text and image made it much easier to explore elements of mise-en-scène that were difficult to efficiently cite alongside the primary texts or film scripts.
The theme branch most closely resembles the folder tree I originally created to accommodate the influx of downloaded PDFs. Unsurprisingly, it is the most fraught of the three branches. Unlike the others, I do not incorporate individual citations here – only works themselves. It is purely macrothematic.
If I had to say whether the works of thinkers like Lucretius or Walter Benjamin belonged to “literature” or “philosophy,” I would have to concede that the answer is problematic enough to justify assigning them to both – not because it is impossible to make this distinction in every case, but because the act of actually going through each case (compiled as they are within anthologies and collected works) is simply not worth the time. It’s also easier for me to create a ‘modernity / postmodernity’ category than to try to differentiate modernism from postmodernism in strictly historical terms (especially since I have already entered the original publication dates for each reference and can quickly sort texts chronologically within and across categories) or to distinguish the literature of these genres from the criticism thereupon (since so many modernist and postmodernist works are regarded as such insofar as they blur the line between original production and critical reception).
The asymmetrical relationship between “theory,” “criticism” and “critical theory” shows how counterintuitive thematic denomination and correlation can be. Criticism proves to be a critical category as de Man (and Mallarmé) once observed. Indeed, it is in ‘criticism’ – the deceptively inconspicuous vestige of a former system – that de Man would read the ironic allegory of the system’s undoing. The problems it poses within my knowledgebase reflect those that concern our institutional infrastructure more broadly. Even after the subfolders of the original ‘assorted’ folder are reborn as themes, ‘criticism’ remains intransigent as if undead – a revenant.
When, after all, does a work become ‘critical’? For many, the term ‘criticism’ might stand for any number of hermeneutic approaches, but I have already assimilated these to more specific themes. After all of the critical modalities are spoken for and all of the assorted texts get sorted, is there still a place for criticism? Is its primary mnemotechnical function to catch the spillover from other categories? Can it outlive its usefulness? Should this be seen as a problem or a solution?
Eventually, I decided to use it more as a keyword than anything else – something to filter works by an author from works about an author. This at least enables me to add secondary texts to the ‘author’ categories because I know that I can use the ‘criticism’ tag to distinguish them from the primary texts. After all, ‘criticism’ also tends to suggest the subordinate relation of ‘critical’ to ‘primary’ texts. While this is the most pragmatic use of ‘criticism’ I have come up with so far, it is still problematic. Do critical works involve sustained attention to specific works or might they be directed towards more abstract areas of interest (e.g. cultural studies)? How sustained? How specific? How general can the critical object become before it nullifies any pretense of critical evaluation? These questions might be answered provisionally for many texts, but are especially difficult when it comes to those more parasitic, deconstructive works that threaten to usurp their hosts. The works of Derrida, de Man, Benjamin, Deleuze, etc., put criticism in crisis on an informational level since designating them as both ‘criticism’ and ‘theory,’ ‘philosophy’ or ‘literature’ risks undermining the sorting function of ‘criticism’ qua tag. I’m certain that all of the aforementioned authors would be tickled by this; after all, these were the categories they strove so valiantly to dismantle. But is there really no way to preserve the distinction between the text doing the reading and the text being read (even if this requires mapping these shifting relations more precisely than ever before)?
I have considered converting the category of criticism into an actual keyword (rather than a category that functions like a keyword), but soon saw that this would be little more than a half-measure. Really the only functional difference between categories and keywords is that categories allow for subordination while keywords do not – so it’s not like anything would be gained by doing so. There’s also the argument that this additional layer of metadata should not be wasted on the works themselves, but reserved exclusively for the discrete citations within them. While neither categories nor keywords will fully resolve the problem, I still prefer to see this anti-category of ‘criticism’ as a placeholder for another kind of metadata entirely: the bidirectional (one-to-many) link proposed by Vannevar Bush and elaborated by Ted Nelson. This kind of link would enable us to join any citation to any other instance of that citation in such a way that all texts would be tied to each of their descendants and antecedents in a vast web encompassing the remotest references and the farthest reaches of recorded history. But for this to happen we would at least need the kind of infrastructure that a program like Citavi might, eventually, sustain. For the time being then, criticism maintains its odd spectral role within the knowledgebase – a kind of negative categorical function that can no longer be what it was and cannot yet be what it might.
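A rough sense of what such a link would require can be given in Python. The sketch below is speculative and assumes nothing about Citavi, which does not currently offer this kind of link: the class, its methods and the citation identifiers are all hypothetical. Each link is recorded in both directions, so that a passage knows both what it quotes and what quotes it.

```python
from collections import defaultdict


class LinkIndex:
    """Hypothetical two-way index between citations."""

    def __init__(self):
        self._quotes = defaultdict(set)     # citation -> passages it quotes
        self._quoted_by = defaultdict(set)  # passage -> citations quoting it

    def link(self, source, target):
        """Record that `source` quotes `target`, in both directions."""
        self._quotes[source].add(target)
        self._quoted_by[target].add(source)

    def antecedents(self, item):
        """Passages that this item quotes directly."""
        return set(self._quotes.get(item, ()))

    def descendants(self, item):
        """Passages that quote this item directly."""
        return set(self._quoted_by.get(item, ()))

    def lineage(self, item):
        """Every antecedent reachable by following quotations backward."""
        seen, stack = set(), [item]
        while stack:
            for earlier in self._quotes.get(stack.pop(), ()):
                if earlier not in seen:
                    seen.add(earlier)
                    stack.append(earlier)
        return seen


links = LinkIndex()
# Hypothetical identifiers: an essay quotes a critical work, which quotes a primary text.
links.link("essay:p3", "criticism:p10")
links.link("criticism:p10", "primary:p2")
```

Here the relation between the text doing the reading and the text being read is stored as a direction on the link itself, so it would not need to be approximated by a ‘criticism’ category or tag.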
I’ve acknowledged numerous times that much of this categorical structuring is traditional (i.e. macrothematic) in its attention to entire works. The real beauty of Citavi is its ability to extend metadata to the microthematic level of the citations themselves. So far, the work-specific themes in the author branch of the database have pushed furthest in this direction but there are still limits to the precision of categories. In order to further anatomize each citation, it is helpful to discard the hierarchical structure of categories altogether in favor of keyword tags. While categories can be made to function like keywords (e.g. ‘criticism’), this blurs the line between the two kinds of metadata and eventually prevents them from functioning optimally. Categories are best suited for macrothematic grouping and the subordination of multiple pieces of information. Keywords, by contrast, are most adept at enumerating the properties of a particular knowledge item in a list-like fashion.
The difference between categories and keywords is much less pronounced with a sentence-length citation than it is with one that spans several pages. The usefulness of keywords increases with the length of the citation. As I discovered during my qualifying exams, longer citations, while they provide more context for a wider variety of occasions, risk generalizing the meaning of the categories under which they are grouped. This is why I tried to preserve some of the sentence-level emphases in the citation tree using boldface text (as I continue to do in Citavi). Highlighting, underlining, circling, boldfacing and marginal annotation can all be seen as more primitive forms of microthematic tagging. Their greatest advantage is their immanent visibility, but this is limited to the outermost layer of the interface. Regardless of whether the interface is a page or a screen, there are only so many visual marks one can add to a passage without muddling the distinctions entirely. Both of my previous attempts at intratextual and extratextual annotation were limited in their microthematic potential because of the visual and spatial economy of the interface. Their most visible layer still needed to remain brief to reduce the likelihood of it being skimmed or skipped.
Perennially inundated with information, our minds are conditioned to pay more heed to macrothematic generalities than microthematic nuances. The potential of keywords lies in their ability to register this nuance without necessarily relying on the outermost layer of the interface. Obviously they must become visible in order for us to read them, but they do not compete with the text of the citation for graphical real-estate. They form an invisible layer that can be searched, sorted and displayed in any number of ways. This means that we no longer need to waste time and space inscribing or digging up microthematic content. In Citavi, at least, deep metadata can be layered beneath the facsimile of the printed page in a way that allows for microthematic annotation to become increasingly visible and central in years to come.
I must admit that when I first began experimenting with keyword tagging I did not fully appreciate its usefulness. Unlike the categorical structure I had already developed in the citation tree, I was building this keyword lexicon from scratch. As I continued to tag a variety of passages from different authors and genres, however, I found myself refining it along similar lines as the theme branch. In order to avoid redundancy, I began aggregating synonymous (and often antonymous) keywords into clusters. Eventually I was able to tag each new passage closer to the rate at which I read it (thanks largely to the autocomplete functionality of Citavi which only required me to type the first few letters of each keyword or cluster). The increased fluidity of this workflow promoted deeper and more extensive tagging. More significantly, the repetitive act of reading while tagging eventually embedded the keywords in my mind in a way that, I believe, has fundamentally altered the way that I read. This was at least as powerful as the technique of generating text-specific themes from general samplings of citations and, not coincidentally, the two procedures have become almost inextricable in my current workflow; exploring the breadth of themes present within a sampling of citations with keywords reveals the coarser-grained themes best suited for categories.
It’s quite difficult to describe the sensation I had after doing this kind of tagging for any length of time. Obviously, it’s easy to work ourselves into a state performing any kind of scholarly or repetitive task for hours on end. I had certainly experienced more than enough of this while building the citation tree. But I remember noticing a distinctly different sensation once I incorporated keywords into my workflow – as if my mind were being siphoned out into some new lateral dimension. Perhaps because, cognitively, this extreme form of technologically assisted, microthematic tagging is at the farthest remove from the more dominant mode of macrothematic reading that seeks to cherry-pick passages for the immediate task at hand. The latter approach lacks the random access memory to even imagine reference-specific categories, let alone an entire lexicon of keywords. With the prosthetic memory of a knowledgebase, however, the stress of retention diminishes, leaving the mind free to explore less hierarchical forms of linkage.
I’ve even found that this kind of deep tagging is a particularly effective way to bootstrap myself out of writer’s block. Often, sitting down to write this dissertation, I am so overwhelmed by all of the possible sequences and combinations of ideas that I find myself utterly nonplused. But even taking a half hour to tag citations for some relevant text can restore my clarity of purpose and writerly inertia. Perhaps this is because our ability to subordinate ideas in order to construct a narrative suffers when this perceived need for sequentiality begins to restrict our ability to explore the breadth of all possible microthematic associations. This is to suggest that the vertical, linear, hierarchical and sequential dimensions of our imagination might actually be primed by this more lateral, ad hoc, nebulous kind of thinking rather than distracted by it.
Rather than trying to come up with perfectly phrased categories or keywords, I create groups that juxtapose closely related elements, the most relevant of which, in the context of the passage, is easy to infer. I’ve found that the ‘/’ works particularly well as a rough and ready symbol of thematic juxtaposition. But clustering is another partial solution to the problem of marking reciprocal relations between increasingly specific levels of metadata. Like the hybrid, tag-like function of the ‘criticism’ category, it too must be seen as a placeholder for a more advanced form of metadata – for the kind of intertextual links that might directly join works and citations without relying on categories or keywords as intermediaries.
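The clustering convention, together with the prefix autocomplete that makes it quick to use, can be approximated in a short Python sketch. The code assumes only that a cluster is a single label whose terms are separated by ‘/’. The function and class names are hypothetical, the two sample clusters are invented for illustration, and nothing here reflects how Citavi implements its own autocomplete.

```python
def parse_cluster(label):
    """Split a label such as "memory / forgetting" into its lowercase terms."""
    return [term.strip().lower() for term in label.split("/") if term.strip()]


class Lexicon:
    """Hypothetical keyword lexicon in which every term points to its cluster."""

    def __init__(self, labels):
        self.clusters = list(labels)
        self._by_term = {}
        for label in self.clusters:
            for term in parse_cluster(label):
                # A term belongs to the first cluster that mentions it.
                self._by_term.setdefault(term, label)

    def complete(self, prefix):
        """Return the clusters containing a term that starts with the prefix."""
        prefix = prefix.strip().lower()
        hits = {
            label
            for term, label in self._by_term.items()
            if term.startswith(prefix)
        }
        return sorted(hits)


# Invented sample clusters: one pairs antonyms, the other related metaphors.
lexicon = Lexicon(["memory / forgetting", "web / cloud"])
```

Typing the first letters of any term in a cluster, for example `lexicon.complete("forg")`, retrieves the whole cluster, so the tagger does not need to remember which of the juxtaposed terms was listed first.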
What if there were a way to cluster keywords that also granted us the capacity to apply one or more of the clustered keywords to concentric selections within a general citation? This would offer an interesting alternative to hierarchical categorization insofar as the selection of a specific term from a cluster of related terms would not necessarily subordinate one term to the other. Rather than hierarchy, we might end up with something more like density – something capable of registering linguistic intensity within a web or cloudlike structure.
It is true that metaphors of knowledge as a web and cloud have already been reified by our search engines and social media platforms. But rather than seeing this reification as a sign of alienation, many scholars have argued that it has “democratized” the formerly hierarchical structure of information. Might keyword density and intensity eventually replace thematic hierarchies in academia as well? Would webs and clouds model the mnemotechnology of our minds more accurately than vertical axes of power relations? Would this kind of thinking alienate us further from our work or would it make collaboration between humans and machines less alienating? For now, the keywords work more or less like hashtags, but I continue to cluster them hoping that they might one day evolve into a form of metadata capable of describing linguistic information more precisely, in our own words, but in a manner that is endemic to the semantic web and not alienated from it.
The ways in which Citavi has enabled me to home in on the limitations and possibilities of microthematic tagging by observing the friction between categories and keywords are what really open it to the advent of a digital future in the humanities. It makes it possible for all of us to intervene materially in the mnemotechnology of thematization. In speculating about the ways in which keyword clustering might facilitate a more social text, however, I am already drifting beyond what is currently achievable with Citavi or any other scholarly mnemotechnology.