Privacy Preserving Back-Propagation Neural Network Learning Over Arbitrarily Partitioned Data

Abstract:

Neural networks have been an active research area for decades. However, privacy becomes a concern when the training dataset for a neural network is distributed between two parties, which is quite common nowadays. Existing cryptographic approaches, such as the secure scalar product protocol, provide a secure way to train a neural network when the training dataset is vertically partitioned. In this paper we present a privacy-preserving algorithm for neural network learning when the dataset is arbitrarily partitioned between the two parties. We show that our algorithm is secure and leaks no knowledge about the other party's data, except the final weights learned by both parties. We demonstrate the efficiency of our algorithm through experiments on real-world data.

Architecture:

Neural Network:

Existing System:

Existing cryptographic approaches, such as the secure scalar product protocol, provide a secure way to train a neural network when the training dataset is vertically partitioned.

Disadvantages:

To the best of our knowledge, the problem of privacy-preserving neural network learning over arbitrarily partitioned data has not been solved by existing approaches.

Proposed System:

In this paper we propose a privacy preserving algorithm for back-propagation neural network learning when the data is arbitrarily partitioned. Our contributions can be summarized as follows.

(1) To the best of our knowledge, we are the first to propose a privacy-preserving algorithm for neural networks when the data is arbitrarily partitioned.

(2) The algorithm is quite efficient in terms of computational and communication overheads.

(3) In terms of privacy, the algorithm leaks no knowledge about the other party's data except the final learned weights.

Advantages:

The proposed algorithm solves the problem of privacy-preserving neural network learning over arbitrarily partitioned data, which to the best of our knowledge had not been solved before.

Algorithm:

Privacy preserving Algorithm:

It is highly important that not only the data but also the intermediate weights are not revealed to the other party, because intermediate weights contain partial knowledge about the data. We propose an algorithm in which both parties modify the weights and hold random shares of the weights during training. Both parties use secure two-party computation.
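The idea of holding random shares can be illustrated with a minimal sketch of additive secret sharing. This is not the protocol from the paper: the modulus, the fixed-point scale, and all function names below are assumptions chosen for illustration. Each party holds one share that looks random on its own; only the sum of both shares reveals the value, and an update that is itself shared can be applied by each party locally.

```python
import random

# Hypothetical parameters, chosen only for this illustration.
MODULUS = 2**61 - 1   # shares live in the integers modulo this value
SCALE = 10**6         # fixed-point scale used to encode real-valued weights


def encode(weight):
    """Encode a real weight as a fixed-point integer modulo MODULUS."""
    return round(weight * SCALE) % MODULUS


def decode(value):
    """Decode a fixed-point integer back to a real weight (handles negatives)."""
    if value > MODULUS // 2:
        value -= MODULUS
    return value / SCALE


def split_into_shares(value):
    """Split an encoded value into two additive shares that sum to it."""
    share_a = random.randrange(MODULUS)      # uniformly random share for party A
    share_b = (value - share_a) % MODULUS    # party B's share completes the sum
    return share_a, share_b


def reconstruct(share_a, share_b):
    """Recombine both shares into the encoded value."""
    return (share_a + share_b) % MODULUS


# A weight is shared between the two parties.
w_a, w_b = split_into_shares(encode(0.3725))

# A weight update is also shared; each party adds its own share locally.
d_a, d_b = split_into_shares(encode(-0.05))
w_a = (w_a + d_a) % MODULUS
w_b = (w_b + d_b) % MODULUS

# Only when both shares are combined does the updated weight appear.
updated_weight = decode(reconstruct(w_a, w_b))
```

In this sketch the shared update is simply given; in a real protocol it would be the output of a secure two-party computation so that neither party sees it in the clear.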

Modules:

  1. Arbitrary Partitioned Data
  2. Homomorphic Encryption
  3. Privacy Preserving Learning

1. Arbitrary Partitioned Data:

We consider arbitrary partitioning of data between two parties in this paper. In arbitrary partitioning, there is no specific order in which the data is divided between the two parties. The combined data of the two parties can be seen as a single database.
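A minimal sketch of what such a partitioning can look like is given below. It assumes, for illustration only, that every cell of the combined database is held by exactly one of the two parties and that the owner is chosen at random; the function names and the use of `None` for cells a party does not hold are hypothetical.

```python
import random


def partition_arbitrarily(rows, seed=0):
    """Assign every cell of a table to party A or party B, with no fixed pattern."""
    rng = random.Random(seed)
    party_a, party_b = [], []
    for row in rows:
        row_a, row_b = [], []
        for value in row:
            if rng.random() < 0.5:
                row_a.append(value)   # party A holds this cell
                row_b.append(None)    # party B does not
            else:
                row_a.append(None)
                row_b.append(value)
        party_a.append(row_a)
        party_b.append(row_b)
    return party_a, party_b


def combine(party_a, party_b):
    """Rebuild the combined database from the two parties' views."""
    return [
        [a if a is not None else b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(party_a, party_b)
    ]


database = [
    [5.1, 3.5, 1.4],
    [4.9, 3.0, 1.3],
    [6.2, 2.9, 4.3],
]
view_a, view_b = partition_arbitrarily(database)
```

Under this view, vertical partitioning (the same columns for every row) and horizontal partitioning (whole rows per party) would be special cases of the assignment.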

When the training data for a neural network is arbitrarily partitioned between two parties, both parties want to train the network, but neither wants the other party to learn anything about its data except the final weights learned by the network. We therefore propose a privacy-preserving back-propagation neural network learning algorithm for data that is arbitrarily partitioned between two parties.

2. Homomorphic Encryption:

The homomorphic property is a property of certain encryption algorithms whereby specific algebraic operations can be performed on plaintexts by operating on the encrypted messages, without actually decrypting them. For example, given two messages m1 and m2 with encryptions E(m1) and E(m2), an operation such as the product m1m2 can be computed using only E(m1) and E(m2), without decrypting either message.
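The property can be seen in a toy example. The text above does not name an encryption scheme, so the sketch below assumes textbook ElGamal, which is multiplicatively homomorphic: multiplying two ciphertexts component-wise yields a ciphertext of the product of the plaintexts. The prime, the base, and the function names are illustrative assumptions, and the parameters are far too small to be secure.

```python
import random

# Toy parameters for illustration only; real deployments use large primes.
P = 467   # small prime modulus
G = 2     # base used for key generation and encryption


def keygen(rng):
    """Return (private key x, public key h = G^x mod P)."""
    x = rng.randrange(1, P - 1)
    return x, pow(G, x, P)


def encrypt(message, public_key, rng):
    """Encrypt a message in 1..P-1 as the pair (G^r, message * h^r) mod P."""
    r = rng.randrange(1, P - 1)
    return pow(G, r, P), (message * pow(public_key, r, P)) % P


def decrypt(ciphertext, private_key):
    """Recover the message as c2 * c1^(-x) mod P."""
    c1, c2 = ciphertext
    # c1^(P-1-x) equals the inverse of c1^x modulo the prime P.
    inverse = pow(c1, P - 1 - private_key, P)
    return (c2 * inverse) % P


def multiply(ct1, ct2):
    """Multiply two ciphertexts component-wise, without decrypting them."""
    return (ct1[0] * ct2[0]) % P, (ct1[1] * ct2[1]) % P


rng = random.Random(1)
private_key, public_key = keygen(rng)

m1, m2 = 7, 12
e1 = encrypt(m1, public_key, rng)
e2 = encrypt(m2, public_key, rng)

# The product is computed on ciphertexts; only the key holder can read it.
product = decrypt(multiply(e1, e2), private_key)
```

Here `product` equals `(m1 * m2) % P`, so the result is the true product only as long as it stays below the modulus.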

3. Privacy Preserving Learning:

Privacy-preserving learning guarantees stronger security and privacy against intrusion by the other party. Without such guarantees, data providers for machine learning are not willing to train the neural network with their data at the expense of privacy, and even if they do participate in the training, they might either remove some information from their data or provide false information.

Hardware Requirements:

• System : Pentium IV 2.4 GHz.

• Hard Disk : 40 GB.

• Floppy Drive: 1.44 MB.

• Monitor: 15 VGA Colour.

• Mouse: Logitech.

• RAM: 512 MB.

Software Requirements:

• Operating System: Windows 8.

• Coding Language: C#.NET

• Data Base: SQL Server 2008