apache - Error in remote streaming PDF files to Solr
I am trying to stream remote files to Solr for indexing using the stream.url parameter, as follows:
curl 'http://localhost:8983/solr/update/csv?stream.url=http://www.artofproblemsolving.com/resources/papers/satont.pdf&stream.contentType=application/pdf;charset=utf-8'
I am following the solution given here: Remote streaming with Solr. However, the Solr server throws this error:
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">400</int>
    <int name="QTime">518</int>
  </lst>
  <lst name="error">
    <str name="msg">Document is missing mandatory uniqueKey field: id</str>
    <int name="code">400</int>
  </lst>
</response>
I tried looking in the Solr documentation and wiki pages, but couldn't find a single example. Any help would be appreciated.
Update
Here is my schema.xml file - http://pastebin.com/akmrud9n
The problem is that there is one field, i.e., id, which has the required="true" multiValued="false" properties and is being used as the uniqueKey, as <uniqueKey>id</uniqueKey>. There must be a field set as the uniqueKey, or else Solr remote streaming doesn't work. Which field should I use instead of id, then?
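You are trying to send a PDF file to the legacy CSV import endpoint, so it does strange things and complains. The CSV handler tries to parse the PDF bytes as CSV rows, finds no id column, and rejects the document; your schema and the id field are fine as they are.
You want to use the extract handler instead. Its documentation covers a lot of information, including an example with a PDF file and setting the id explicitly.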
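As a rough sketch of what that request could look like for the file in the question: the script below points the same stream.url at /update/extract and passes the uniqueKey value through literal.id. It assumes the extract handler is enabled at /update/extract and that remote streaming is allowed in solrconfig.xml; the id value satont-pdf and the two helper function names are made up for illustration.

```shell
#!/bin/sh
# Assumptions: Solr runs on localhost:8983, the extract handler is mapped to
# /update/extract, and remote streaming is enabled. DOC_ID is a hypothetical value.
SOLR_BASE='http://localhost:8983/solr'
PDF_URL='http://www.artofproblemsolving.com/resources/papers/satont.pdf'
DOC_ID='satont-pdf'

# Build the request URL. literal.id supplies the uniqueKey value that the
# CSV endpoint could not find; commit=true asks Solr to commit after indexing.
build_extract_url() {
  printf '%s/update/extract?literal.id=%s&commit=true&stream.url=%s&stream.contentType=application/pdf' \
    "$SOLR_BASE" "$DOC_ID" "$PDF_URL"
}

# Send the request. Call this once Solr is up.
index_remote_pdf() {
  curl "$(build_extract_url)"
}
```

Calling index_remote_pdf sends the request. If it succeeds, the response should carry status 0 instead of 400, and the document can then be looked up by the id passed in literal.id.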