Journal Title
Title of Journal: Multimed Tools Appl
|
Abbravation: Multimedia Tools and Applications
|
|
|
|
|
Authors: Yongjian Hu ChangTsun Li Zhimao Lai
Publish Date: 2014/05/22
Volume: 74, Issue: 18, Pages: 7405-7428
Abstract
Fast camera fingerprint search is an important issue for source camera identification in realworld applications So far there has been little work done in this area In this paper we propose a novel fast search algorithm We use global information derived from the relationship between the query fingerprint/digest and the reference fingerprints/digests in the database to guide fast search This information can provide more accurate and robust clues for the selection of candidate matching database fingerprints Because the quality of query fingerprints may degrade or vary in realistic applications the construction of robust search clues is significant To speed up the search process we adopt a lookup table that is built on the separatechaining hash table The proposed algorithm has been tested using query images from realworld photos Experiments demonstrate that our algorithm can well adapt to query fingerprints with different quality It can achieve higher detection rates with lower computational cost than the traditional bruteforce search algorithm and a pioneering fast search algorithm in literatureThere have been a number of forensic methods proposed for establishing the relationship and linkage between digital images/videos and the imaging devices responsible for their creation eg 4 8 18 22 Lukas et al 18 first proposed to use the imaging sensor pattern noise SPN such as the PhotoResponse NonUniformity PRNU of imaging sensors as camera/camcorder fingerprint for source camera/camcorder identification Since then a variety of SPNbased algorithms have been presented Some explore new ways of extracting camera fingerprints and extending fingerprint applications from source camera identification to image forgery detection eg 4 8 while others pay attention to improving quality of fingerprints eg 1 12 14 and the detection effectiveness eg 11 15 16 17 In the meantime the issues relating to SPNbased camera identification are raised for example the robustness of SPNbased camera identification 21 the trustworthiness of device SPNs 3 6 18 and the feasibility and reliability of applying SPNbased identification to a sizable fingerprint database 9Although remarkable progress has been made in developing source camera identification for realworld applications some areas such as fast camera identification still need further research Anticipating future use of camera identification from sensor fingerprint by law enforcement and government fast camera identification plays an important role in two typical scenarios i for a large database of N reference camera fingerprints forensic analysts may want to determine whether it contains the fingerprint of the camera that took a given query/test image ii given tens of thousands of query images forensic analysts may want to search through a small or moderatesize reference fingerprint database for the identification of source cameras of those query images and group those query images according to their source cameras The solutions to i and ii are basically the same Therefore we do not distinguish these two application scenarios in this paper The search for a match between the query and the database fingerprints can be formulated as a multiplehypothesis testing problem with a cross correlation detector applied to each fingerprint in the database The sequential search/bruteforce search is a simple and traditional method Given a database of N fingerprints the average number of search rounds for the bruteforce search algorithm BFSA is N/2 Since current commercial cameras are with the resolution of several megapixels the search process may be intractably long if a large number of correlationbased comparisons have to be made How to accurately and efficiently match query fingerprint and reference fingerprints in the database is thus of paramount importance in this applicationFew papers have been published in this area References 2 and 10 are two early works in literature In 2 a tree structured vector quantizationbased fast search algorithm is presented Considering that each camera has a unique SPN fingerprint and each SPN fingerprint can be modeled as an independent and identically distributed noise signal additive to an image this tree structured algorithm does fingerprint matching on a group rather than individual basis Before the search starts the reference fingerprints in the camera database are evenly divided into two groups The sum of fingerprints of each group is viewed as a composite fingerprint and becomes a node of the binary tree Each group is further divided into two subgroups and the composite fingerprint of each subgroup is calculated and treated as a node at a new level of the binary tree Such a binary division continues until no division is possible Finally each subgroup consists of only one reference fingerprint Given a query fingerprint the search starts from top to bottom by picking the node that yields the higher correlation value between the query and the composite fingerprints at each level It is reported that the logarithmic decrease in the search complexity is achieved However when applying this scheme to the identification of source camera in a reference camera fingerprint database the following problem will be raised if a query image is not taken by any reference camera in the database ie the fingerprint of the query image does not reside in the database no matter which tree branch is chosen the decision is wrong One solution to this dilemma is to introduce a decision threshold to let the algorithm proceed along the node where the correlation value is greater than the threshold But query images of various contents from different cameras or even from the same camera often produce query fingerprints with different levels of quality This gives rise to increased correlation variances making the thresholding solution almost infeasible in practical applicationsAnother early fast search algorithm is the Approximate Rank Matching Search ARMS 10 The large number of correlation computations is thought of as a major obstacle to efficient search To overcome this problem the ARMS takes two measures i introduce fingerprint digests to reduce the complexity of cross correlation computation and ii select candidate matching database digests ie reference digests in the database before computing every correlation value The latter can reduce the number of correlation computations in the search process A digest is a compact subset of the original fingerprint that can sufficiently characterize the original fingerprint The elements of the digest are called largemagnitude components or significant components of the fingerprint in 11 and 16 In 10 the fingerprint digest is explicitly defined as the subset that consists of an ordered list of the k ≪n largest components of the ncomponent fingerprint Practically some digest elements are probably defective pixels like hot pixels with large positive values or dead pixels with large negative values on the camera sensor The ARMS is inspired by Spearman Rank Correlation 5 The digest elements that contribute most to the Spearman rank correlation coefficient between the query and database digests are defined as the most influential indices/elements A fast match between the query and database digests relies on the rank information derived from the most influential elements Those elements are more likely located in the beginning of the digests so the ARMS starts the search process from the first largest element of the query digest Using coordinates of the current element the ARMS looks for the digest elements at the same spatial locations on database fingerprints The database digests with nonzero elements are thought of as potential candidate matching digests These potential candidates are further selected by the innerloop operations Finally a priority is given to the selected ones during the search If the first search round fails ie the correlation values between the query and the selected database digests are less than a predefined decision threshold the ARMS goes to the second element of the query digest and repeats the search steps This process proceeds until a match is found or the predefined search time limit is exceeded Since the search clue for each search round is derived from one element of the query digest a time we call the ARMS a local informationbased fast search algorithm Usually local information is sensitive to noise In 10 Goljan et al only gave the results of the ARMS under the assumption that both query and reference fingerprints are good quality fingerprints However the quality of query fingerprints cannot be guaranteed in realworld applications due to different image sources and devicedependent propertiesIn this paper we propose a new fingerprint digestbased fast search algorithm To better handle practical query images we propose to use global information for guiding fast search In general global features are more robust than local features against noise During experiments we observed the following phenomenon when comparing the elements/components at the same spatial locations from a pair of camera fingerprints ie SPN signals the pair from the same camera has more elements with matching signs “+” or “−” than the pair from different cameras This phenomenon becomes more apparent with the increase of fingerprint quality Such observation motivates us to use the number of elements with matching signs between the query fingerprint and the database fingerprints for fast search cluesThe rest of the paper is organized as follows Section 2 analyzes the problem which we face when designing fast search algorithms and describes our initial idea of improving the search order Section 3 introduces our fast search algorithm We elaborate on the measurement of search priority the lookup table and the construction of the Search Priority Array SPA In Section 4 we use experiments to demonstrate the performance and advantages of the proposed algorithm In Section 5 we draw conclusionsTo increase search efficiency current SPNbased fast search algorithms focus on devising a data structure which can select the most likely matching reference fingerprints from the database without resorting to a large amount of correlation computations The major attention of 2 and 10 is paid to the construction of this data structure For realistic applications however there is another challenge ie robustness We use a simple example to explain the impact of fingerprint quality on search results Suppose a database consists of nine reference camera fingerprints W i i = 1 2 … 9 For a given query fingerprint X we further assume that W 7 corresponds to the camera which is responsible for X That is W 7 is the correct matching database fingerprint Let ρ i be the cross correlation between X and W i i = 1 2 … 9 If a correlation value is greater than the decision threshold t k a matching database fingerprint is found We assume that there are two cases which will occur to the database a ρ i ≤ t k if i ≠ 7 otherwise ρ i t k b ρ i ≤ t k i ≠ 2 7 9 otherwise ρ i t k In case a the statistical detector can readily make a correct decision as there is only one correlation value greater than the threshold In case b however the situation is more complex since there are three correlation values which exceed the threshold even though ρ 2 and ρ 9 are marginally greater than t k In this case the search order becomes significant because it affects the output of the search For a sequential search algorithm its detector may choose W 2 as the matching fingerprint rather than W 7 A wrong decision is thus made In fact the poor signal quality of fingerprints often results in large variances of correlation values even for the fingerprints that come from the images taken by the same camera We can readily make database fingerprints have high quality and be at the same quality level eg the same signaltonoise ratio SNR but in realworld application scenarios however we can hardly do the same thing to query fingerprints as they are usually extracted from the images which come from miscellaneous sources and most probably have quite different contents Apparently the situation described in case b occurs to the database more frequently This analysis implies that we must consider the requirement of robustness when we design fast search algorithms This challenge was not given enough attention in previous works eg 2 and 10
Keywords:
.
|
Other Papers In This Journal:
|