From Print to Digital, from Document to Data: Digitalisation at the Publications Office of the European Union

Abstract
Since the 1970s, the Publications Office of the European Union, the official publisher of all the institutions and bodies of the EU, has had to adapt to a fast-changing situation as the number of EU Member States has grown and the number and nature of publications have evolved (including publishing public tenders of EU institutions and Member States in 1978 through a supplement to the Official Journal of the European Union and handling CELEX, an interinstitutional and multilingual automated documentation system for Community law, in 1992). These changes occurred over several ages of computing. The computerisation of the Publications Office was primarily a response to the need for rationalisation and productivity, but the aim was also to gradually adapt to new types of document publication and consultation. These different stages of digitalisation required the constant transfer of information to a multitude of media. Storage media such as punched cards, optical discs and CD-ROMs had varying life expectancies and are all evidence of attempts to digitise information before the Web. This evolution not only illustrates the need to constantly harmonise a large amount of information, it also highlights some continuities. It affects the management of information systems but also meets regularly updated standardisation, interoperability and sustainability needs within a complex ecosystem.

In the field of computing and information and communications technologies, 1969 is remembered for two major events, namely the creation of UNIX (Salus, 1994) and the first Arpanet network connections between four centres in the United States (Paloque-Bergès). However, it also marked the creation of a future actor of the European Union (Prometti, 2000) on the other side of the Atlantic: the Publications Office of the European Union. It was created following a unanimous decision of the institutions of the European Communities on 16 January 1969, and for the last fifty years it has been the official publisher of documents for all EU institutions and bodies including the Parliament, the Council, the Commission, the Court of Justice, the Court of Auditors, the European Economic and Social Committee and the European Investment Bank.
The Publications Office would spend 50 years observing but also participating in the evolution of computerisation, namely the access to computers both in the professional field and in the domestic sphere, as well as in the development of networks and of the Web, initially created in 1989 at the European Organisation for Nuclear Research (Berners-Lee, 2000).
The Office is a prolific publisher that manages constant updates and the requirement of precise legal information. It publishes several tens of thousands of Official Journal (OJ) documents every year, and has since 1978 also been responsible for publishing public tenders for EU institutions and Member States, first in the Official Journal supplement

The 1970s: from mechanical methods to the digitisation of production
It comes as no surprise that the first digitalisation carried out by the Publications Office concerned their internal procedures. Computing was initially seen as an aid to production and a rationalisation tool.
Even before the beginning of computing as we know it today, punched cards and magnetic strips had already made their appearance at the Publications Office. The real transition to computers occurred in the middle of the 1970s, when they were used to create the tables for the OJ⁵.
Discussions in 1975 led to the development of a computer system for compiling the annual indexes, based on the data contained in the monthly tables recorded on magnetic tape. One of the first practical applications will be the preparation, in the first weeks of 1977, of the annual index for 1976. With that end in view, the sector has made a start with harmonising the terminology in the six languages to facilitate the work of programming the computer6.
However, the first steps towards digitalisation had already been made in some European institutions as early as the 1960s. This is noted by Hélène Bernet in her testimony on CELEX (Publications Office, 2006). This legal database, which was developed within the Commission and only joined the Publications Office at the beginning of the 1990s, was the result of this early digitalisation movement.
It was initially a story written in the first person singular from 1963, until a team was created in 1967. This activity had not been planned in community statutes or budgets -the first programmers had been recruited for clerical posts, and the computing centre of the European Atomic Energy Community (Euratom) was still called the 'Mechanography Workshop', working on monoprogramming with sequential memories (wide magnetic strips) and punched cards … Updating the files from the Euratom Information and Document Centre took 3 hours and took up everybody's time (Publications Office, 2006: 10). This testimony shows that there was already a small initial movement towards digitalisation in the 1960s (on early digital, see Haigh, 2019), which was often the result of individual initiatives.
The use of magnetic strips was the first step in digitalisation. It would make a lasting mark on the activities of the Publications Office, as the transition to computerised systems would still be in progress ten years later. In 1986, the implementation of a computerised infrastructure to enable the publication of the European Inventory of Existing Chemical Substances required staff to transfer the translations of 100 000 substances in the six different languages of the OJ from magnetic formats to the infrastructure in order to avoid inputting the data a second time.
Although all the technical problems have been ironed out, and although this method could lead to savings at the inputting level and shorten publication deadlines, it imposes new constraints on authors as far as manuscript quality is concerned. Since these constraints cannot always be adhered to, the economic advantages hoped for are soon cancelled out by the need for subsequent processing⁷.
The necessary attention to service quality and the time-consuming nature of this transition are typical of the first computing era. This set the challenge of a posteriori digitisation that the Office would also have to take on for the publication of the OJ at the end of the 1990s: it was only at the end of 1999 that the digitisation of the OJ documents published since 1952 was completed and made fully available online to the public.
It is important to note the vigilance of the trade unions. They saw the first steps of digitalisation as a possible source of problems in terms of both training and recruitment, and were concerned about the growing trend to outsource multimedia and digital tasks8 (we will return to this trend later in the article). However, it should be emphasised that computerisation was a gradual process9 that did not encounter any major opposition within the Office.
This period is still essentially that of paper and manual corrections. Presses rolled continuously in the printing office that had existed in the Publications Office at the very beginning: the team was always prepared to step in in the unlikely case of a problem occurring in the central printing office. In the meantime, it printed catalogues of publications and documents, brochures, posters, etc. Rita Sedav, who joined the Office in 1974 with her husband, who was responsible for building maintenance, recalls: At the printing office I liked working on the famous round table where we were about 10 women, picking up the sheets to make booklets,  to make other booklets, then finally put them together. The round table was a special table because it was a place to work, but also to chat […]. The table had a motor underneath -it turned automatically, and when we wanted to stop it moving, we just did it manually10.
The Publications Office of the European Union was built in 1973 in Luxembourg above the cellars of the former Mercier champagne company. This location was ideal due to its proximity to the city's train station, and it even had its own private tunnel to the nearby Post Office railway station. This facilitated the initial shipping of a fast-growing production of paper documents requiring a large physical infrastructure. The distribution centre running from 1990 to 2015 in Gasperich, in the south of the City of Luxembourg, was also very busy with a constant use of assembly machines, packaging paper storage and deliveries of documents using shuttle services sent from Brussels11.
The 1980s to the early 1990s: databases, telematics and CD-ROMs as milestones of digital evolution
Over the years there was a dramatic improvement in the methods used in the field of computing, in the workplace in general (see Ceruzzi, 1998; Campbell-Kelly and Aspray, 2004; Bösch, 2018¹²) and at the Publications Office in particular. When the IT department took its first steps¹³ in 1987, there were just 20 terminals connected to a central computer and 20 word-processing stations, and the first Macintosh computer with graphics tools was used to help create models. When Willem van Gemert joined the Office as a Dutch proofreader at the end of 1995, he was surprised by the strong presence of typewriters because he thought European institutions were more advanced in the computing field¹⁴. The IT department had, however, changed substantially since its humble beginnings: it had abundant hardware, whether in terms of mainframes and servers or computer terminals. The latter included more than 150 computers using UNIX, mainly for the different editorial or documentary departments, over 500 personal computers for office work (of which at least 450 were permanently connected to the local network) and half a dozen Macintosh computers that were mainly used in the context of graphic design applications¹⁵.
The computerisation of the Office had an undeniable impact on production capacities and the daily work of the staff, but it also led to changes in the ability to disseminate information. The Office's motives were both internal and external. It was by no means only driven by technical factors: as mentioned above, computing was initially seen as an aid to production and a rationalisation tool. There were underlying economic factors, too: the costs related to the production, printing, storage and dispatch of paper documents had to be compared with those incurred by computing, although the business models associated with digitisation were not stabilised and did not provide an unequivocal answer. However, over time, thought began to be given to changes in how information was transmitted, the time needed and the accessibility, transparency and up-to-date nature of information. As we will see, this perspective gained in clarity and coherence as the years passed, but the first experiments in digital communications were carried out back in the 1980s, the golden age of telematics. The transition occurred in several stages, namely via telematic processes in the 1980s followed by CD-ROMs. This highlights some of the "missing narratives" in the history of networks that historians are now pointing to as a means of demonstrating that the genealogies and avenues of digitisation were more complex than simply the history of the Internet transition (Campbell-Kelly and Garcia-Swartz, 2013; Driscoll and Paloque-Bergès, 2017). CELEX, the official interinstitutional legal database of the European Union mentioned earlier in this document, became accessible during the golden age of the Minitel via '36 17 CELEX'¹⁶.
10 Testimony available in the film created by the Office for its 50th anniversary in 2019. This account draws our attention to the gendered nature of tasks in the Office. The contribution of women during this early period merits further research - which we have not as yet been able to carry out - on the transition to computing and its consequences on the employment of women and the gendered distribution of roles in the workplace. This could usefully complement the fascinating work carried out in particular by Abbate in 2012 and Hicks in 2017.
11 Interview with Luc Jeangille, 13 February 2019.
12 Regarding our topic, we would especially suggest the reading in Bösch's book of the chapter written by Kim Christian Priemel offering a comparison of the Bundesrepublik with the United States and Great Britain on the theme "Computer und die industriellen Arbeitsbeziehungen in den Druckindustrien" from 1962 to 1995 (Computers and Industrial Labour Relations in Printing Industries from 1962 to 1995).
13 Interview with António Carneiro, Director, Directorate C, Access to and reuse of public information, 31 January 2019.
14 Film created by the Office for its 50th anniversary, 2019.
15 Twenty-seventh annual management report, Office for Official Publications of the European Communities, 1995, pp. 71-72. For more context about the computer services industry and the emergence of companies that offered data processing, programming, systems integration, etc., see (Yost, 2017).
The different media and databases managed by the Office could now produce a true digital archaeology that enables researchers to analyse not only changes in digital technologies but also their superposition and complementarity.
During the 1980s and 1990s, multimedia diffusion began to affect all types of information, and particularly the OJ, which was launched in a CD-ROM version in 1997¹⁷. Initially, however, the CD-ROM met opposition due to the rejection of the electronic publication as a substitute for the paper publication, as only paper versions had legal value. Additionally, the microfiche was still very popular in 1995: although the number of microfiches essentially depended on the number of publications published on paper, the number of microfiches dispatched for customer orders increased (+ 38.5 %)¹⁸. In accordance with the request of the Management Board, the microfiche edition of the OJ was discontinued on 31 December 1999¹⁹ and was replaced by the CD-ROM edition. In 1996, the prospect of replacing microfiches with electronic solutions had been openly raised. Internal production was already falling because of ageing equipment, which it did not seem worth replacing given the prices proposed by external companies²⁰. In 1999, the decision was taken to suspend outsourced production of microfiches, for which demand had not disappeared but was on the decline, and instead to promote the new CD-ROM. The Office did, however, obtain a licence with a private publisher to publish the OJ on microfiche for users who still preferred to use this method.
This late decision to replace microfiches may be surprising, but the continued use of the CD-ROM until the beginning of the 2000s is even more so, given the widespread use of the Internet and the Web across a vast part of Europe. Here again, the Publications Office chose to continue using the CD-ROM whilst introducing a more recent type of digital media. It released a new version of the OJ S CD-ROM (containing details of public tenders) in 1999; 'this format allows subscribers to enjoy the comfort of the CD-ROM whilst being able to access recent updates and archives contained in the TED (Tenders Electronic Daily) database via Internet21'. Indeed, the 1999 annual management report notes the significant increase in the production of periodical CD-ROMs (such as Eurostat) outside of those produced by the OJ, in the monthly L and C series22, and underlines the attachment of clients to CD-ROMs; this is paradoxical as they preferred to pay for the CD-ROM rather than use the free access available online. As with microfiches beforehand, there was a degree of mistrust in a new solution which some feared would be of lesser quality, especially if it could be accessed free of charge.
The situation of the OJ is, however, paradoxical. Indeed, although the paid CD-ROM version of the OJ S version and the free database TED currently share the same interface, and despite the fact that TED offers a higher diffusion speed (cutting several days off the time needed to respond to a call for tenders), the number of subscriptions to the CD-ROM increased by almost 10 % in 1999. The cost of telecommunications, connection difficulties or the inadequacy of the customers' computer equipment most probably explain this phenomenon, but it is also likely that a discerning and professional clientele prefers to pay for a service, thus allowing them to demand a high level of quality and reliability in return. It therefore seems clear that a high-quality CD-ROM or a feature-rich document database does not necessarily suffer from the competition of a free service23.
It is also important not to underestimate the way in which people develop consultation, research and reading habits when using different media -there is undoubtedly a significant difference when it comes to using microfiches and the Web, for example -, even if, as explained in the following extract, the CD-ROM version and the TED free online database had the same interface. This also should invite us to deepen the comparison between all these supports, formats and content, as Matthew S. Hull had already suggested in "Documents and Bureaucracy" (2012: 261): All that said, the insights we have gained from attention to paper-mediated documents have much to offer to the study of electronic forms. After all, adapting Geertz's aphorism, anthropologists do not study paper villages; they study in them. The relation between electronic forms of communication and studies of paper is not only historiographic, but also historical and theoretical. Historically, new communications technologies have supplemented and transformed, rather than replaced, older ones, and paper documents are no exception (Sellen & Harper, 2002).

The second part of the 1990s: the appearance of the Internet and the Web
Although in the late 1990s users showed a continued preference for the CD-ROM and considered it to be more reliable and of higher quality than the Web and online services, the Publications Office Management Committee nevertheless asked an interinstitutional group dedicated to 'OJ content and structure' to develop modes of publication that would be more in line with the requirements of 'transparency' and 'financial economy', which were to become two key expressions of European policy and two strong motivations for digitalisation.
These aims were in line with those of the Executive Committee, namely to take into account the rapid development of the Internet and its effect on the goals and the structure of the Office. This also involved providing the necessary editorial capacity to produce this type of digital publication for all the institutions: 'Some form of interinstitutional collaboration was needed for Internet use, and the Office was considered to be the best platform to implement these changes24'.
This early mention of the Publications Office's role as a platform is interesting and draws our attention to two related phenomena. The first is the 1996 decision by the Office to work with new 'multimedia' contractors, thus enabling it to meet the institutions' requirements for CD-ROMs and Internet publications after its initially amateurish attempts at the Internet, as described by Alexander von Witzleben: At the beginning of the 1990s and at the end of last century, it was very difficult to find information you were looking for, because publications were only available on paper, and secondly, did you know where to find the paper you needed? […] 'Alexander, please provide us with a website for the Publications Office'. So how do you do this in 1995 if there is no contractor, no one who knew how to do it? I had a stagiaire and he told me that at his university there were some guys dealing with the Internet. So we invited them and they got a bit of pocket money which was paid by our director, and with this we created the first website of the Publications Office 25.
Yet the Office was not only an intermediary, it was also a player in this increased digitalisation for the European institutions, which were starting to take a serious interest in the "Information Society", on the heels of the European Commission (Gibbs, 2001). This meant widening the range of tasks carried out by the Office because the author services of the European institutions and agencies that habitually used the Office for adaptation work on their websites increasingly entrusted it with the entirety of their projects, from conception to the final technical production²⁶. The Office now had to cope with this increased demand from author services that were not always aware of the constraints of production. It had to channel and limit late and urgent requests for changes to content during the procedures from authors who considered digital production to be more flexible and responsive, as well as dealing with the poor quality of some of the electronic documents delivered. The author services sometimes underestimated the efforts involved in digitalisation, which actually requires substantial content adaptation.
24 Idem, pp. 5-6.
25 Film created by the Office for its 50th anniversary, 2019.
26 Thirty-first annual management report, Office for Official Publications of the European Communities, 1999, p. 30.
In addition to information on existing CD-ROMs and databases, the Europa Internet server has provided access to EUR-OP News since 1995, giving easy access to news on the current policies of all the European institutions by providing an overview of all new publications from the EU. This was an undeniable success, with 26 500 connections to the site in December 1995 alone. However, the online release of Interinstitutional Directory of European Administration data was delayed until the following year due to the need for substantial changes to dissemination formats.
Here again, success came quickly, with an average of 1 500 connections per month in 1996 for the version available to the general public on the Europa server, while the internal version available on Europaplus was used daily, notably by translation services. The transition from three languages (German, English and French) to eleven was scheduled for the beginning of 1997²⁷. At this time, as mentioned above, the European Union underwent a further enlargement (Kaiser & Elvert, 2004), with the accession of Sweden, Finland and Austria in 1995 bringing the number of Member States to 15, resulting in the publication of a special edition of current Community acts in Finnish and Swedish (48 000 pages)²⁸. This required a considerable effort, but it was nothing compared to what would be needed with the enlargement in the early 2000s (in particular the accession of 10 countries in 2004). In the latter half of the 1990s, the EU institutions stepped up their work on "Changing Spaces of Political Communication" (Schlesinger, 2001). The "'period of reflection' (2005-2007) that followed the rejection of the Treaty establishing a Constitution for Europe by the French and Dutch citizens" and "spurred the EU institutions into action, with the Commission notably producing Plan D for Democracy, Dialogue and Debate only months after the 'No' vote in France (COM(2005)494, final)" probably also had an impact, as "the EU's public communication, the Commission's new, and widely publicised, strategy has at its heart the strengthening of communicative and collaborative linkages with civil society and the public, in an effort to enhance informed debates on EU issues and widen participation in the consultation stages of decision-making" (Michailidou, 2008).
The widespread use of networking and digital media for information dissemination also led to profound changes in the work of the Office when preparing publications: the white table piled high with papers and the red and green pens of the language editors29 were gradually abandoned for computerised correction processes, and there were changes in the correspondence with the author services and in production tasks. Outside these changes, effects were felt throughout the whole 'life cycle' of a publication from its identification (we will return to this subject) to updates and maintenance. The Office now had to be able to satisfy increasing demands for updates, which required 'service continuity' that had to be taken into account when estimating the workload and the allocation of human resources.
These efforts were productive insofar as they allowed the Office to gradually take over a large part of the production chain. This was noted by Per Hoj, who worked at the Office for 43 years from 1974 to 2017: I think the biggest revolution of all my time at the Office was what we would call the IT revolution. It was moving from typewriters and desktop calculators to computers. When I started we were all sitting at typewriters […], we were using calculators instead of spread sheets and we had no computers, we had no emails, everything was typed out and sent down the corridor to the next person. That was probably the biggest change of all, trying to persuade the Office management and through the Office management persuade the institutions to see dissemination as part of the production chain. Until the early 1990s the Office was only in charge of production, of the publication of the OJ […]30.
The Office also developed expertise in this area that it could lend to the institutions and to departments producing information. In this respect it can be considered as one of the stakeholders that contributed to the process of "building Europe on expertise", to cite the analysis by Kohlrausch and Trischler (2014)31. Amongst other effects, production time was significantly shortened by the optimisation of production at all levels: there was a new production environment supported by semi-automatic correction tools allowing an increase in productivity as well as an increase in the use of telematics infrastructure. This included file-transfer procedures and computer mailboxes by the various actors in the production chain and the development of workflow tools.
27 Twenty-eighth annual management report, Office for Official Publications of the European Communities, 1996, p. 17.
28 Twenty-seventh annual management report, Office for Official Publications of the European Communities, 1995, p. 9. No special edition was required for Austria since its official language is German.
29 Interview with Helder Da Costa de Pina and Anja Damerow, 13 February 2019.
30 Film created by the Office for its 50th anniversary, 2019.
31 On European experts and expertise, see also (Bouneau and Burigana, 2018).
While the Office worked internally on the integration of digital technologies, in 1999 the management committee also launched a new large-scale project, namely the establishment of an integrated, user-friendly and comprehensive access system for legal texts and the creation of a single portal for access to European Union law.
At the same time an interinstitutional task force recommended the implementation of integrated, coherent and complete access to all electronically available legal documents on the Europa server. CELEX was meant to be the core part of this online service. The approach foresaw action on three levels: to create a coherent data pool and improve the production chain, to eliminate and avoid redundancies and to provide a single entry point. In July 1999 the Management Committee of the Publications Office decided on EUR-Lex, which was launched in April 1998 and offers access to the Official Journal, to become the single gateway that should - by being built around the CELEX database system - allow easy access to all legal information sources (Publications Office, 2006: 31). It was a great success: the number of connections grew continuously, with EUR-Lex attendance indicators for 1999³² showing between 8 000 and 9 000 users consulting 70 000 to 130 000 summary pages of the online OJ per day. This was a large audience for a site that had been launched just one year previously. The site provided access to information about EU law, case-law from the Court of Justice and other EU public documents, as well as to the electronic edition of the OJ, which was free for the first twenty days after its publication.
These changes, and in particular the transition to free access that started to emerge in the late 1990s for some publications, were sometimes a cause for concern that was echoed in other areas of the publishing sector, particularly in the scientific domain. As pointed out in the 1999 annual management report, free access to a number of databases such as TED and the discontinuation of other paid products such as the paper version of the OJ S resulted in a loss of income in 1999³³. These losses continued a trend that had begun before 1999 with the drop in the number of paid subscriptions after the introduction of free online products such as EUR-Lex and the European Commission Library Automated System. The free-access policy also led to the gradual abandonment of the intermediary network of companies who sold the EUR-OP databases on CD-ROM or on floppy disk. The accessibility and openness of European information became central imperatives, and free access to EU information continued to be generalised throughout the 2000s.

The 2000s: towards an integrated digital policy
On the 25th anniversary of the online accessibility of European law, Thomas Cranfield, the Director-General of the Publications Office in 2006, noted the following: CELEX -Communitatis Europae Lex -was born out of ideas and initiatives within the Legal Service of the Commission of the European Communities at the end of the 1960s. The initiatives bore fruit in the context of the development of Community law: a human brain could no longer process all of the documents using traditional means and as early as the beginning of the 1970s, a legal Community database had been made available internally within the Legal Service. Such an accomplishment at that time allows us to say today that CELEX is the oldest of legal databases and the oldest still in use. […] The distance covered is long considering that a system, which was born as a tool for professionals, first for a limited internal use and then open to the public on 1 July 1981, is today available to the whole world and free of charge. In this evolution, the access to documents and to legal information of the European Union, which began in 1981 depended on two other events. The first one also goes back to 1981 and is the presentation to the public of the first personal computer. The second is the Internet (Publications Office, 2006). Thomas Cranfield's comments highlight two elements, namely the availability and the free nature of information, which were central to making the OJ and European legislation freely available and which gained importance throughout the 2000s. Let us first consider EUR-Lex. This website is currently the leading online source of EU law (8.5 million documents available and 58 million connections in 2018). It is the most frequently consulted website managed by the Publications Office and indeed of the entire europa.eu domain. EUR-Lex initially provided free online access to the OJ for a period of twenty days after its publication, as previously mentioned. 
This first step towards free information was followed by completely free access to the OJ via the EUR-Lex site from 1 January 2002 onwards. The impact of the directive on the reuse of public-sector information reinforced and generalised the transition to free services by 2015. This is a clear example of how the law influenced policy-making in the EC/EU, which in turn had a direct impact on the Office's activities: in this respect the Office reflected wider developments in the EU in terms of institutions, information, communication (especially public communication, see Valentini & Nesti, 2010 on its history and challenges) and of course legislation. Although this development led to the disappearance of sales offices for EU documents, which had until then physically represented the Publications Office in the different Member States³⁴, it was a significant additional step towards the availability and reuse of public data, whilst conveying a willingness for transparency and equality. This free access provided equal opportunities for all, regardless of whether users were SMEs (small and medium-sized enterprises) with little income consulting public tender announcements in TED or large groups who were generally subscribers³⁵. The second important event came on 1 July 2013, when the online OJ was deemed authoritative: the electronic signature allowed the publication to be certified and have legal effect. There was no further need to deposit an official copy of the OJ at the door of the Publications Office every morning; a symbolic page had been turned in the acknowledgement of electronic publishing.
Efforts were made to render these electronic publications more accessible over time. However, in previous years digital media had been juxtaposed and simultaneously used; this was not always done deliberately but was often explained by the integration of external platforms such as CELEX or the Community Research and Development Information Service (CORDIS). The latter is a data bank that was launched in November 1990 to provide the results of European research funding. The CORDIS team joined the Publications Office of the European Union in 2004. CORDIS was initially composed of three databases published on ECHO (European Commission Host Organisation), following a DG XIII36 initiative in 1988. By 1991, CORDIS had grown to six databases, which became accessible on CD-ROM in 1993, followed by a complementary online service a year later37. In 1996, all nine databases and an online service were combined on a single website. This makes CORDIS the oldest website of the European institutions, with 15 000 connections per month in 1994 (see note 38), climbing to 5 million page visits and 230 000 users per month in 2002.
Faced not only with increasing amounts of information but also with growing needs for services, the Publications Office therefore had to harmonise its sources of information and render the latter more accessible. This was undertaken in the second half of the 1990s. As early as 1996, the Publications Office emphasised the need to organise some form of interinstitutional collaboration on Internet use, and a data exchange format called Eurolook39 was developed to exchange electronic texts between institutions. From the 2000s onwards, the emphasis was put on harmonisation, interoperability and standardisation. This target was underlined when Martine Reicherts became the Director-General of the Publications Office and committed to reducing information silos40 whilst aiming to develop solutions that would be compatible with semantic web development41, signalling an emblematic change from document to data.
The integration of ten new Member States within the European Union in 2004 marked a turning point42. This evidently led to changes in terms of staff and multilingualism through the recruitment of hundreds of additional language editors, but it was also an exciting digital challenge. Staff from the ten new Member States took part in a new experiment with a digital dimension, namely the use of XML for correction work. This mark-up language facilitates the automated exchange of complex content between heterogeneous information systems and the standardisation of content and metadata. A 'true start-up spirit'43 appeared as newcomers worked to turn Word documents into XML, and this digital standardisation effort was concomitant with an effort to standardise language.
34 Interview with Monique Dejeans, 13 February 2019.
35 Interview with Aija Bilzena, Head of Sector, Reception, production and dissemination of the TED and EU public procurement unit, 12 February 2019.
36 On the history of DG XIII, see Jourdain (1996).
42 On the history of European integration and EU enlargements, see the following: Bitsch, 2004; Dinan, 2004; Dedman, 2009; Hix and Høyland, 2011; Nugent, 2010; Bache et al., 2011.
43 Interview with Eva Beňová, Director, Corporate services, 31 January 2019.
Another effort for the conservation, digitalisation and standardisation of information takes place within 'the Cellar', the Publications Office's common repository of metadata and content. From 2012 onwards, it became the repository for all data and grouped together the databases of the Office. The electronic archive Eudor (McGowan and Phinnemore, 2002) also stores documents from all institutions.
Attempts to reduce information silos, build cohesion in information and permit the compatibility and interoperability of systems involved the application of common standards (Yates and Murphy, 2019; Jancovic et al., 2019). This effort has been particularly visible over recent years, as shown by the following three examples: first, the European Legislation Identifier (ELI), a 2017 joint initiative by the Member States and the Publications Office44 that made possible an ontology based on linked open data45; second, recent work on dissemination metadata within the Interinstitutional Metadata Maintenance Committee, which was presided over by the Office and was composed of representatives of all the EU Member States46; and, finally, the older policy of identifiers, which was set out in the early 2000s and concerned the 'international standard serial number' (ISSN), the 'international standard book number' (ISBN) and the 'digital object identifier' (DOI)47. Madeleine Kiss attempted to convince her Office colleagues and the author services of the importance of identifiers, 'because in the Office we mainly knew how to use catalogue numbers, which was quite an ingenuous concept but was not visible on an international scale':
Until June 2002, the ISSN International Centre was responsible for assigning ISSN numbers (including the production of the related bibliographic records) to the entire editorial production of continuous publications of the European Union. This task was very difficult, however, as the centre did not always receive the voucher copies required to enable it to fully carry out its responsibilities (most importantly for the production of the bibliographic records); moreover, ISSN requests were often made directly to the various national depository libraries.
However, the situation changed when in 1999 the Publications Office was formally requested to start cataloguing at source all institutional publications using the machine-readable cataloguing format. From that moment on the Publications Office could become an agency for assigning ISSN and ISBN to its editorial production as it was able to produce standardised records and enable access to its production (Kiss, 2015).
Publications had to be corrected, harmonised and standardised not only in terms of descriptions, identifiers and formats, but also for content. This is reflected in the work of the language editors but also in the creation of thesauri for use on the EU vocabularies site, to create tables of authority and concordance to be used as guides for different countries. This work had to keep pace with the daily publications of the OJ or public tenders published in TED (with a volume that sometimes reached 5 000 public tenders per day, totalling more than 47 million connections per year and an annual income of EUR 460 billion48).
The internal harmonisation effort had to be met by the simultaneous provision of integrated, easy and comprehensive access to information for users through attractive interfaces, and thus the redesigning of websites. This was far from being an easy task. Harmonising the graphic charter was a long-term process.
The http://www.eur-op.eu.int site has been totally restructured, successfully making use of web publishing techniques. The graphics are now consistent throughout the site, which offers thematic and multilingual navigation. The various language versions making up the site can be loaded as and when they become available without affecting the consistency of language and thematic consultation, as this is generated automatically. These mechanisms allow readers to access documents as and when they are available instead of waiting for the final language to become available before downloading49.

44 The ELI framework includes technical specifications on: (1) web identifiers for legal resources (building on URI templates at European, national and regional levels based on a defined set of components); (2) metadata set specifying how to describe legal information, and its expression in a formal ontology; and (3) recommendations for exchanging legislation in machine-readable formats and integrating metadata into a legislative website. See https://publications.europa.eu/fr/web/eu-vocabularies/eli.

Changes were made in several areas, ranging from a possibility to display three different language versions in EUR-Lex to the redesigning of the most recent CORDIS website (launched in December 2018) to improve its readability50.
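The first element of the ELI framework set out in note 44, web identifiers built from URI templates with a defined set of components, can be illustrated with a minimal sketch. The function name `build_eli_uri`, the base address used as a default and the choice and order of components in the usage example are assumptions made for illustration only; the authoritative templates are those defined in the ELI specifications and by each jurisdiction.

```python
from urllib.parse import quote

# Assumed base address for the example; a national publisher would use its own.
ELI_BASE = "http://data.europa.eu/eli"


def build_eli_uri(components, base=ELI_BASE):
    """Join the non-empty components of a template into one identifier.

    `components` is an ordered sequence (for example type, year, number);
    empty or missing components are skipped, and each remaining one is
    percent-encoded so that it is safe to use as a single path segment.
    """
    segments = [
        quote(str(component), safe="")
        for component in components
        if component is not None and component != ""
    ]
    return "/".join([base.rstrip("/")] + segments)


# Hypothetical usage: type of act, year, number, published version.
print(build_eli_uri(["reg", 2016, 679, "oj"]))
```

The point of such a template is that an identifier can be derived from a few descriptive components rather than assigned arbitrarily, which is what makes legal resources predictable to link to across systems.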
The Publications Office sought to attain greater transparency in all its targets, such as reducing the data deluge, the number of titles and the abundance of data created by versioning, providing more targeted and higher-quality information51, giving answers to requests that range from very simple queries via the EU Whoiswho site to more complex demands, informing the reader as EUR-Lex does in its Summaries of legislation of the European Union, giving access to the consolidated versions of the laws and their legal history, enabling users to carry out a more efficient search52 via the Semantic Web, and of course developing open data accessibility and reuse.
Several complementary approaches were used to attain these goals. The first aim was to deal with an increasing volume of data by trying to prioritise quality over quantity. Indeed, the Office became the leading partner in a consideration of how to rationalise information in all the Commission's services. Second, the Publications Office develops means to help users navigate between the different versions of a text (of which there are often several when a law is created) by proposing consolidated versions of European legislation in EUR-Lex or by linking the data by means of hyperlinks53. This is viewed as a long-term policy, and challenges of sustainability also became evident in the recent appearance of a Preservation and Legal Deposit Unit in the organisation chart of the Office. The unit's missions include the archiving of Europa.eu websites, European institutions and agencies, whose oldest contents date back to 2013 and for which the Historical Archives of Florence had previously developed a pilot project. As well as the regular preservation of website content, great efforts are made when websites are redesigned or information is removed, as seen when the Court of Justice redesigned its website on the closure of the civil service court in 201654.
This valorising of information also concerns the opening up of access to data, which follows the reuse55 policy that had already been initiated by DG Communications Networks, Content and Technology via the EU Open Data Portal56 (which became part of the Office in 2012) and the European Data Portal57, which harvested data from national, regional and specific portals. The Publications Office works to apply this reuse policy, which includes not only the open data found on these two servers but also the reuse of data from the Publications Office's websites, such as CORDIS, EUR-Lex or EU Whoiswho, and the reuse of the Cellar through the use of machine-readable SPARQL or RESTful web services. Although the increase from 12 data suppliers for the EU Open Data Portal in 2012 to nearly 80 today illustrates the increasing awareness of what is at stake, it is still important to stress the value of data, avoid 'data hugging' and reluctance to give access to data, develop interoperability, make datasets available and encourage their reuse58. The datathons organised by the Office over recent years59 have shown the social and economic value of data and the wide range of possible uses for this resource in sectors such as food and energy, or indeed the examination of working conditions in the different Member States of the European Union. In addition, the growing involvement of non-human actors can help to make content more accessible, for example through the implementation of automatic translation in certain web spaces or the use of the Office's application programming interfaces to access datasets. EUR-Lex and other collections stored in the Cellar provide the Office with a rich source of data to fuel machine learning60.
50 Interview with Karl Ferrand and Vensti Voikov, CORDIS, 13 February 2019.
For the evolution of CORDIS, we refer in particular to the frequent captures by Internet Archive in the Wayback Machine, which testify to regular redesigns and changes from a site that was initially conceived as a point of access for multiple information sources and particularly databases to a device that seeks to be user friendly and much more readable. The first web archive retrieved in Internet Archive collections is from 1
This policy of open data and reuse is a sign of the awareness that information is no longer conveyed by texts alone, but is now also based on visualisation or data. Change has occurred not only in the content but also in the means used to consult the information via mobile phones, telephones and tablets, thus necessitating a design that is compatible with these devices. This entails the creation of new editorial methods, as content cannot simply be duplicated from one media type to another. The section dealing with general publications at the Office published over 3 000 works in 2018 (corresponding to 14 000 titles due to different formats and language versions), and these publications are made in increasingly diverse formats61, ranging from books, posters, short videos and various audio-visual content to computer animations, web series, mobile apps and even virtual reality experiences for Eurostat. All of these innovations redefine publications both at the Office and as a general rule.

Conclusion
Although Tim Berners-Lee, the creator of the World Wide Web, may have argued that 'data is a precious thing and will last longer than the systems themselves' (Berners-Lee, 2006), when it comes to the history of the Publications Office of the European Union this statement should be somewhat attenuated. Indeed, data accessibility and sustainability are dependent on the systems that underpin them. In light of Lisa Gitelman's observation that 'raw data is an oxymoron' (Gitelman, 2013), in other words that data is always constructed, even in its most trivial appearance, the Publications Office plays an essential role as a data provider. In this role of mediator and interinstitutional actor, the Office is no longer simply a publisher of documents, but becomes a platform for the circulation, structuring, editorialisation and dissemination of information, and is also viewed as a trusted third party.
The transition from paper to digital and from document to data, and the changes experienced by the publications within the Office make it possible to develop several lines of research to follow up on this first approach. This study reveals that the Publications Office is a player that has evolved from being a traditional publisher to a provider of information management services.
Our approach could be replicated for other actors both within European institutions and in the world of publication in order to better understand the historical evolution of digitalisation in professional, economic and political fields.
Moreover, our initial exploration of the history of computerisation and digitisation at the Office also opens up a number of complementary avenues for research:
- By demonstrating institutional and managerial motivations (from the rationalisation of production and the challenge of financial economy to transparency, open access and reuse, and adaptation to changing media), our narrative has undoubtedly glossed over the many clashes and discussions that may have arisen in house. An additional study is needed to explore the real impact of digitisation on the daily working practices of the Office's employees. This was not possible with the archives and oral interviews used at this stage, even if some do provide us with occasional snippets, such as our interview with Luc Jeangille, who is responsible for the Print and Distribution Unit. He gave an evocative description of the Gasperich distribution centre that closed in 2015. He described it as a 'little industry' that 'managed storage and packaging. We had assembly machines that automatically packaged the OJ in polyethylene film, so it was really like a little Amazon, but in Publications Office style.'
- It is clear that this perspective would also strengthen the gendered dimension of the research, an issue that is touched on in our article (particularly in the section on paper and mechanical methods), but which if further explored could usefully contribute to a social approach to the history of the Office and computing in general.
- The choices made in this article did not allow us to fully develop the potential of Web archives (Brügger, 2018; Musiani et al., 2019) for a study of this nature. We hinted at it, particularly with the case of CORDIS, but it merits a focused analysis of the various websites used by the Office over its history. This would also give us a better understanding of users, or at least the Office's perception of users.
The actual uses and users of the Office's information, of which our paper offers an implicit typology (from the EU institutions to the general public) that broadened over time, reflecting political developments at EU level and the EU's changing information and communication policy, are also potentially accessible as a further means of elucidating the Office's history. This is demonstrated by our allusions to key accounts and subscribers to databases and their adoption of microfiches and CD-ROMs, as well as more recent cases of datathon participants. This is a third open area of study that is only touched on by the article.
This study also confirms the importance of basing analyses not only on innovation, but also on the appropriation of technical systems, their uses and their maintenance. The historical reflection movement initiated in particular by Andrew Russell and Lee Vinsel (2016) about maintainers provided helpful information on new actors. This is also true for the research about maintainers by David Pontille and Jérôme Denis (2019). Their work also leads us to reflect on open data policies, their 'rawification' and their public availability (Denis and Goëta, 2017).
In line with the research of Delphine Gardey (2015) and Jonathan Chibois (2019) on the technical and communication infrastructures within state organisations and the complex links between administrative and institutional agents and technical artefacts, we can hope for a more advanced approach to digitalisation within the European institutions by observing the interactions between personnel, organisational cultures and technologies. Here, we are in complete agreement with Nathan Ensmenger, who suggested to computer historians that: 'One way that we could make our subject more engaging and relevant to others is to focus on people rather than on machines. One of the most significant insights of recent scholarship in the history of technology has been the realisation that technological change is as much driven by social processes as by inherent technological imperatives' (Ensmenger, 2004).