Identification and Retrieval of Personal Records from a Statistical Data Bank

J. Schlörer

doi:10.1055/s-0038-1635690


Methods Inf Med 1975; 14(01): 7-13
DOI: 10.1055/s-0038-1635690


Schattauer GmbH


Herausfinden von Dossiers aus Statistischen Datenbanken

Authors

J. Schlörer

¹Department of Medical Statistics, Documentation and Data Processing, University of Ulm

Further Information

Publication History

Publication Date: 14 February 2018 (online)


Personal records can often be retrieved from a statistical data bank that holds anonymous but individual records and can be evaluated by dialogue (e.g., by entering logical conditions, which the system answers by returning absolute frequencies). A user first has to identify the record in question, using prior knowledge. An experiment with authentic data is described that gives some information on the amount of prior knowledge needed for identification. Once a record is identified, parts or the whole of it can be retrieved by various techniques. The efficiency of a previously known technique is studied. A new technique is described that permits retrieval of complete personal records and is resistant to conventional output restrictions. It becomes feasible as soon as AND plus at least one of the logical operations NOT and OR is available. Some possible countermeasures are discussed.

If statistical data banks contain data that are anonymous but stored per individual, and if they permit evaluation by dialogue (for instance, entering logical conditions that are answered with the output of absolute frequencies), then extracting personal information is often possible. The user must first identify the record of the person sought with the help of prior knowledge. To this end, an experiment with authentic data is described that gives an indication of the amount of prior knowledge needed for identification. After successful identification, the content of the identified record can be determined in part or in full by means of various techniques. The performance of an already known technique is examined more closely. A new method for the complete extraction of identified records is described, which is insensitive to conventional output restrictions. It becomes feasible as soon as, besides logical AND, at least one of the operations NOT or OR is available. Some possible protective measures are discussed.
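The summary above describes the attack only in words. The following is a minimal Python sketch of the general idea, not the procedure as given in the paper, whose details are not reproduced on this page. It assumes a toy data bank (RECORDS), a hypothetical count() dialogue query, and an assumed output restriction that refuses any count below K or above N - K. All names, the data, and the value of K are illustrative. The prior knowledge C is split into C1 AND C2, and the auxiliary condition T = C1 AND NOT C2 is used so that every submitted query stays inside the permitted range.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Cond = Callable[[Record], bool]

# Hypothetical toy data bank: anonymous but individual records.
RECORDS: List[Record] = [
    {"sex": "f", "born": 1940, "job": "nurse",   "diabetes": True},
    {"sex": "f", "born": 1940, "job": "teacher", "diabetes": False},
    {"sex": "m", "born": 1940, "job": "nurse",   "diabetes": False},
    {"sex": "f", "born": 1952, "job": "clerk",   "diabetes": True},
    {"sex": "m", "born": 1952, "job": "clerk",   "diabetes": False},
    {"sex": "f", "born": 1940, "job": "clerk",   "diabetes": False},
]

K = 2  # assumed restriction: counts below K or above N - K are refused


class Refused(Exception):
    """Raised when the assumed output restriction suppresses an answer."""


def AND(*conds: Cond) -> Cond:
    return lambda r: all(c(r) for c in conds)


def OR(*conds: Cond) -> Cond:
    return lambda r: any(c(r) for c in conds)


def NOT(cond: Cond) -> Cond:
    return lambda r: not cond(r)


def count(cond: Cond, k: int = K) -> int:
    """Dialogue query: absolute frequency of records satisfying cond."""
    n = sum(1 for r in RECORDS if cond(r))
    if n < k or n > len(RECORDS) - k:
        raise Refused("count suppressed by output restriction")
    return n


# Prior knowledge about the target: female, born 1940, nurse.
c1 = AND(lambda r: r["sex"] == "f", lambda r: r["born"] == 1940)
c2 = lambda r: r["job"] == "nurse"
a = lambda r: bool(r["diabetes"])  # attribute the snooper wants to learn

# Without any restriction (k=0), the direct approach works:
# the target is identified if C matches exactly one record,
# and has attribute A if C AND A still matches one record.
identified = count(AND(c1, c2), k=0) == 1
unrestricted_answer = count(AND(c1, c2, a), k=0) == 1

# With the restriction in place, the direct query is refused.
try:
    count(AND(c1, c2))
    blocked = False
except Refused:
    blocked = True

# Indirect route: T = C1 AND NOT C2, using only permitted counts.
t = AND(c1, NOT(c2))
matches = count(c1) - count(t)                         # equals count(C)
tracker_answer = (count(OR(t, AND(c1, a))) - count(t)) == 1

print(identified, unrestricted_answer, blocked, matches, tracker_answer)
```

In this toy setting the restriction blocks the direct query, yet the two differences recover both the fact that the target is unique and the value of the attribute. Repeating the last step for each attribute would rebuild the full record. The expression T OR (C1 AND A) can also be written with AND and NOT alone as C1 AND NOT (C2 AND NOT A).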

Keywords

Security - Statistical Data Banks - Evaluation by Dialogue - Confidentiality


