Editing of KBART files

Frequent errors

When editing KBART files, some errors occur very frequently and should be avoided.

  • Incorrect character encoding: The character encoding of the KBART file must be UTF-8. Some providers deviate from this standard, and the encoding can also change unintentionally while a KBART file is being edited. Therefore, always check the encoding.
  • Reformatting dates: Dates such as publication dates, coverage dates, etc. must be given in the format YYYY-MM-DD. Spreadsheet editors sometimes automatically change this pattern to other formats such as DD.MM.YYYY. This reformatting must be avoided.
  • Reformatting ISBNs: ISBNs, and ISBN-13s in particular, are sometimes given without hyphens. Spreadsheet editors then misinterpret them as numerical values and display them in exponential notation (e.g. 9.78179E+12). As a result, the last digits are lost and the ISBN becomes unusable.
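
The last two errors leave recognisable traces in the file, so they can be searched for automatically. The following is a minimal sketch in Python, not an official GOKB check. The function name, the two regular expressions and the sample rows are assumptions made for illustration. It flags only cells that look like DD.MM.YYYY dates or like numbers in exponential notation.

```python
import re

# Patterns for the two kinds of spreadsheet damage described above.
# Both regular expressions are illustrative assumptions, not part of KBART.
DOTTED_DATE = re.compile(r"^\d{1,2}\.\d{1,2}\.\d{4}$")          # e.g. 15.01.2020
EXPONENTIAL = re.compile(r"^\d(?:[.,]\d+)?E\+\d+$", re.IGNORECASE)  # e.g. 9.78179E+12


def find_suspicious_cells(text):
    """Return (line_number, column_number, value) for every suspicious cell.

    `text` is the decoded content of a tab-delimited file.
    Line and column numbers are 1-based.
    """
    problems = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, cell in enumerate(line.split("\t"), start=1):
            value = cell.strip()
            if DOTTED_DATE.match(value) or EXPONENTIAL.match(value):
                problems.append((line_no, col_no, value))
    return problems


# Invented sample data: one damaged ISBN and one reformatted date.
sample = (
    "publication_title\tprint_identifier\tdate_monograph_published_print\n"
    "Book A\t9.78179E+12\t2020-01-15\n"
    "Book B\t9781789123456\t15.01.2020\n"
)

for problem in find_suspicious_cells(sample):
    print(problem)
```

A cell reported by this sketch cannot be repaired automatically: the lost ISBN digits have to be restored from the original provider file.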

Microsoft Excel

Do not open KBART files in Excel via the ‘Open’ menu. Instead, import them into an empty workbook via the menu item Data > From Text/CSV.

Screenshot of an empty Excel document with the menu item From Text/CSV framed in red

Import settings

An import window will open. In this window it is extremely important to make the following settings correctly:

Screenshot of the Excel import settings window
  • File origin: Select the character encoding of the KBART file here. Excel does not detect it automatically and suggests the encoding 1252: Western European (Windows), which is wrong in most cases. According to the NISO standard, the encoding must be 65001: Unicode (UTF-8). Select this encoding from the dropdown. Note: Providers may not follow the NISO standard and may use a different encoding. In this case, select the encoding used by the provider. To learn how to identify the encoding, see the section ‘Recognise and correct character encoding’.
  • Delimiter: For KBART files, tab stops are the default delimiters. So select Tab. Usually this selection is already made.
  • Data Type Detection: Select ‘Do not detect data types’, because Excel's automatic data type detection can corrupt the data.
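
For comparison, the same three settings (UTF-8, tab as delimiter, no data type detection) can be expressed in a short Python script. This is only a sketch that assumes the file content is already available as bytes. The function name is hypothetical. Python's csv module returns every cell as text, so ISBNs and dates are not reinterpreted.

```python
import csv
import io


def read_kbart_rows(raw_bytes, encoding="utf-8"):
    """Decode tab-delimited bytes and return the rows as lists of strings."""
    # Decoding may raise UnicodeDecodeError if the chosen encoding is wrong.
    text = raw_bytes.decode(encoding)
    # csv.reader never converts cells to numbers or dates; every value stays text.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    return [row for row in reader]


# Invented sample content, encoded as UTF-8.
raw = "publication_title\tprint_identifier\nBuch Ä\t9781789123456\n".encode("utf-8")
rows = read_kbart_rows(raw)
print(rows[1])
```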

Correction of header row

Excel does not recognise the column heading row of the KBART file as a header row and adds a header row of its own.

Screenshot of an Excel list in which the term header is outlined in red
  • Deselect the added row as a header row. To do this, remove the check mark under Design > Header Row.
  • Now delete the empty row by right-clicking on the row number ‘1’ and selecting Delete from the context menu.

Editing the KBART file

You can now edit the file as you wish. Please note the standards of the KBART format, which are summarised here.

Saving the KBART file

Save the KBART file via File > Save as. After you have selected a storage location, a window appears. Here you have to select the following settings:

  • File type: Text (tab-delimited) (*.txt)
  • Tools > Web Options > Encoding tab > Save document as: Unicode (UTF-8)

Confirm the selection with ‘OK’. Excel then displays a note that the file type does not support workbooks with multiple sheets. Confirm this note with ‘OK’ as well.

Important note: Excel often saves the file in an incorrect character encoding even when this procedure is followed correctly. Therefore, check the character encoding of the saved file as described in the section ‘Recognise and correct character encoding’.

LibreOffice Calc

LibreOffice Calc is a free spreadsheet editor suitable for editing KBART files. KBART files can be opened simply via File > Open.

Import settings

An import window will open. In this window it is extremely important to make the following settings correctly:

Screenshot of a dialogue window for importing a KBART file
  • Import: Select the character set of the KBART file here. LibreOffice Calc does not detect it automatically, but it suggests Unicode (UTF-8), which is correct in most cases. According to the NISO standard, the encoding must be Unicode (UTF-8). Note: Providers may not follow the NISO standard and may use a different encoding. In this case, select the encoding used by the provider. To learn how to identify the encoding, see the section ‘Recognise and correct character encoding’.
  • Separator Options: For KBART files, tab stops are the default separators. Therefore select Tab. Usually this selection is already made.
  • Other Options: Deselect all options here. In particular, ‘Detect special numbers’ must be switched off, because incorrect number detection can corrupt the data.

Editing the KBART file

You can now edit the file as you wish. Please note the standards of the KBART format, which are summarised here.

Saving the KBART file

Save the KBART file via File > Save as. Please select the following settings here:

  • File name: Specify the file name with the suffix .txt or .tsv. The GOKB accepts both extensions.
  • File type: CSV (*.csv)
  • Automatic file name extension: Remove the check mark from this option.
  • Edit filter settings: Select this option.

After selecting the ‘Save’ button, the program asks you to confirm that the file should be saved in CSV format. Confirm the dialogue.

A new Export text file dialogue then opens. Make the following settings here:

  • Character set: Unicode (UTF-8)
  • Field delimiter: {Tabulator}
  • String delimiter: " (straight double quotation mark)
  • Save cell content as shown: Select or check the box
  • All other options: Deselect or uncheck

Confirm the selection with ‘OK’.
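
The export settings above correspond roughly to the following Python sketch. It assumes that the rows are available as lists of strings. The function name and the sample rows are hypothetical, and the choice of "\n" as line ending is an assumption that the text above does not specify.

```python
import csv
import io


def write_kbart_bytes(rows):
    """Serialise rows as tab-delimited text and return UTF-8 encoded bytes."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(
        buffer,
        delimiter="\t",             # field delimiter: tab
        quotechar='"',              # string delimiter: straight double quote
        quoting=csv.QUOTE_MINIMAL,  # quote only cells that need it
        lineterminator="\n",        # assumption: Unix line endings
    )
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")  # character set: UTF-8


# Invented sample rows; all values are strings and are written unchanged.
data = write_kbart_bytes([
    ["publication_title", "date_first_issue_online"],
    ["Zeitschrift für X", "2020-01-15"],
])
print(data)
```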

Recognise and correct character encoding

To recognise and edit the character encoding, the GOKB team recommends the free tool Notepad++. Since KBART files are text files, they can be edited with the text editor Notepad++.

Open the KBART file in Notepad++. Normally Notepad++ detects the character encoding of the file and displays it in the footer of the window. If the encoding has not been detected correctly, Notepad++ displays the affected characters incorrectly, on a black background. This typically affects:

  • Punctuation marks like apostrophes
  • Umlauts like ä, ö, ü
  • Other letters and characters outside the standard Latin alphabet (accents etc.)
Screenshot of a KBART file in Notepad++ with incorrectly encoded characters marked in red

Use the Encoding menu to select the correct character encoding for the file so that the characters listed above are displayed correctly. Then choose Encoding > Convert to UTF-8 in the same menu to convert the file to UTF-8, the character encoding recommended by the KBART standard, and save it.

Screenshot of a KBART file in LibreOffice format with the term coding framed in red
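
The check and the conversion can also be approximated in a script. The following Python sketch is an illustration under assumptions, not a replacement for the visual check in Notepad++. Both function names are hypothetical, and the source encoding must be known in advance (Windows-1252 is used here only as an example).

```python
def is_valid_utf8(raw_bytes):
    """Return True if the bytes can be decoded as UTF-8 without errors."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(raw_bytes, source_encoding):
    """Re-encode bytes from a known source encoding to UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


# Invented sample: a title with umlauts saved as Windows-1252.
legacy = "Zeitschrift für Ökologie".encode("cp1252")
print(is_valid_utf8(legacy))

fixed = convert_to_utf8(legacy, "cp1252")
print(is_valid_utf8(fixed))
```

Note that a successful UTF-8 decode only shows that the bytes are valid UTF-8. A file containing only ASCII characters, for example, passes this check regardless of the encoding it was saved in.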

In rare cases, a character in a UTF-8 encoded file cannot be assigned unambiguously. The import of the KBART file then fails with the error message ‘File has illegal charset null!’.

The file can be corrected as follows:

Open the file in Notepad++ and select Convert to UTF-8-BOM (BOM = Byte Order Mark) from the Encoding menu. This allows the system to assume more reliably that the file is encoded in UTF-8. Then save the file. As a precaution, you should validate it again before importing.
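
What the conversion to UTF-8-BOM does at byte level can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that the input is already valid UTF-8. It places the three BOM bytes EF BB BF at the start of the file and avoids adding a second BOM if one is already present.

```python
import codecs


def ensure_utf8_bom(raw_bytes):
    """Return UTF-8 bytes that start with exactly one byte order mark."""
    # "utf-8-sig" removes a leading BOM while decoding, if there is one.
    text = raw_bytes.decode("utf-8-sig")
    return codecs.BOM_UTF8 + text.encode("utf-8")


# Invented sample content without a BOM.
with_bom = ensure_utf8_bom("publication_title\n".encode("utf-8"))
print(with_bom[:3] == codecs.BOM_UTF8)
```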