ViPER, a ground truth labeling tool and annotation format, is widely used in the computer vision community, for example in the UCF YouTube Action dataset. However, because the ground truth is stored in XML, it can be awkward to work with for those who are not familiar with XML and who run their experiments in MATLAB.
This post gives an example of how to read the XML ground truth information into MATLAB.
The example XML file is 'v_biking_13_05.xgtf' from the UCF YouTube Action dataset, which comes with bounding box annotations (http://www.cs.ucf.edu/~liujg/YouTube_Action_dataset.html).
1. Download viper-gt.jar from http://viper-toolkit.sourceforge.net: click "ViPER Light Distribution (Recommended) [May 25, 2005]" under the Software section.
Then, in MATLAB, add the jar to the Java class path (replace ViPER_Folder with the folder that contains the jar):
>> javaaddpath('ViPER_Folder/viper-gt.jar')
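2. Run the following commands to obtain a ViPER object, setting filename to the path of your v_biking_13_05.xgtf. The factory class is written with its full package name because MATLAB does not import javax.xml.parsers by default.
>> filename = 'v_biking_13_05.xgtf'; % replace with the path to your file
>> docBuilder = javax.xml.parsers.DocumentBuilderFactory.newInstance();
>> docBuilder.setNamespaceAware(true);
>> docElement = docBuilder.newDocumentBuilder().parse(filename).getDocumentElement();
>> parser = viper.api.impl.ViperParser;
>> objviper = parser.parseDoc(docElement);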
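Steps 1 and 2 can be wrapped in a small helper so that the class path setup and the parsing happen in one call. The following is only a sketch: the function name loadViperDoc and its error identifiers are made up for illustration, absolute paths are assumed (Java resolves relative paths against its own working directory, which may differ from MATLAB's), and the file is passed as a java.io.File rather than a bare string so that the path is not interpreted as a URI. The ViperParser calls are simply the ones used in step 2.

```matlab
function objviper = loadViperDoc(jarPath, xgtfPath)
% loadViperDoc  (hypothetical helper) Parse an .xgtf file into a ViPER object.
%   jarPath  - absolute path to viper-gt.jar
%   xgtfPath - absolute path to the ground truth file

    % Fail early with a clear message if either file is missing.
    if exist(jarPath, 'file') ~= 2
        error('loadViperDoc:jarNotFound', 'Cannot find jar: %s', jarPath);
    end
    if exist(xgtfPath, 'file') ~= 2
        error('loadViperDoc:xgtfNotFound', 'Cannot find file: %s', xgtfPath);
    end

    % Add the jar to the dynamic Java class path unless this exact
    % path string is already listed there.
    if ~any(strcmp(javaclasspath('-dynamic'), jarPath))
        javaaddpath(jarPath);
    end

    % Build a namespace-aware DOM parser and read the XML document.
    factory = javax.xml.parsers.DocumentBuilderFactory.newInstance();
    factory.setNamespaceAware(true);
    builder = factory.newDocumentBuilder();
    docElement = builder.parse(java.io.File(xgtfPath)).getDocumentElement();

    % Hand the root element to the ViPER parser, as in step 2.
    parser = viper.api.impl.ViperParser;
    objviper = parser.parseDoc(docElement);
end
```

Checking both paths up front turns a long Java stack trace into a one-line MATLAB error, which is usually easier to act on.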
3. Obtain a string holding the bounding box information. (The getChild() indices depend on the structure of the xgtf file, so they may differ for other files, but they work for this example file.)
str = char(objviper.getChild(1).getChild(0).getChild(1).getAttribute('Location').toString);
4. Parse str into an n x 4 matrix of bounding boxes, one row per annotated frame.
The following code should do the job.
function BB = parseStr(str)
    dq = strfind(str, '"'); % double quote locations
    bn = strfind(str, ')'); % closing parenthesis locations, one per span
    if ( length(dq) / 2 ~= length(bn) )
        error('dq should be equal to 2 * bn');
    end
    BB = [];
    for i = 1 : length(bn)
        % text between the opening quote of span i and its closing parenthesis
        tmpBB = parseOneSeg(str( dq( (i-1) * 2 + 1) + 1 : bn(i) - 1) );
        BB = cat(1, BB, tmpBB);
    end
end
function tmpBB = parseOneSeg(str)
    dq = strfind(str, '"'); % closing quote of the box
    BB = parseBB(str(1 : dq - 1) );
    % frame range [strNum, endNum); the end frame is exclusive
    strNum = str2double( str( strfind(str, '[') + 1 : strfind(str, ',') - 1) );
    endNum = str2double( str( strfind(str, ', ') + 2 : length(str) ) );
    tmpBB = repmat(BB, endNum - strNum, 1); % one row per frame in the range
end
function BB = parseBB(str)
    BB = zeros(1, 4);
    sp = strfind(str, ' '); % space locations
    sp = [0 sp length(str)+1];
    for i = 1 : 4
        BB(i) = str2double( str( sp(i)+1 : sp(i+1) - 1) );
    end
end
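The same parsing can also be sketched with a single regular expression, which additionally returns the frame number that each row belongs to. This is an alternative to the code above, not a replacement for it, and it rests on an assumption: that the Location string is a sequence of spans of the form "x y width height"[start, end), which is what the index arithmetic in parseStr implies. The name parseStrRegexp and the second output frames are introduced here for illustration only.

```matlab
function [BB, frames] = parseStrRegexp(str)
% parseStrRegexp  (hypothetical alternative) Parse spans such as
%   "10 20 30 40"[1, 3)
% into one bounding box row per frame, plus the matching frame numbers.

    % Tokens per match: box text, start frame, end frame (exclusive).
    tok = regexp(str, '"([^"]*)"\s*\[\s*(\d+)\s*,\s*(\d+)\s*\)', 'tokens');

    BB = zeros(0, 4);
    frames = zeros(0, 1);
    for k = 1 : numel(tok)
        box = sscanf(tok{k}{1}, '%f')';   % row vector of the numbers in the box
        if numel(box) ~= 4
            error('parseStrRegexp:badBox', ...
                  'Expected 4 numbers in a box, got "%s"', tok{k}{1});
        end
        first = str2double(tok{k}{2});
        last  = str2double(tok{k}{3});    % exclusive end of the span

        % Repeat the box once for every frame in [first, last).
        BB = [BB; repmat(box, last - first, 1)];          %#ok<AGROW>
        frames = [frames; (first : last - 1)'];           %#ok<AGROW>
    end
end
```

With frames available, row k can be drawn on frame frames(k) instead of relying on the annotation starting at frame 1 and having no gaps.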
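5. Finally, you may want to visualize the bounding boxes.
The following small function may help. It assumes that row i of BB belongs to frame i of the video.
function readViper_visualize(BB, videoName)
    video = VideoReader(videoName);
    nFrame = video.NumberOfFrames;
    nRow = size(BB, 1);
    n = min(nFrame, nRow); % only draw frames that have an annotation
    % BB = BB(:, [3 4 1 2]); % uncomment to reorder the columns if needed
    for i = 1 : n
        I = read(video, i);
        imshow(I);
        % rectangle expects 'Position' as [x y width height]
        rectangle('Position', BB(i, :), 'EdgeColor', 'r', 'LineWidth', 3);
        title(['Frame ' num2str(i)]);
        drawnow; % refresh the figure so that every frame is displayed
    end
end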