Indexing EPUB content into Solr

Solr schema
  - 1 document per chapter, then collapse
  - multivalued fields: chapter_title and chapter_text, keeping order

Text extraction
  How to extract structured text from EPUB:
  - Tika
  - Pandoc: epub->tei or epub->docbook
  - custom EPUB reader
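
A rough sketch of the "custom EPUB reader" idea together with the two schema notes, read here as alternatives. It uses only the Python standard library and assumes a well-formed EPUB: a META-INF/container.xml pointing at the OPF file, one spine item per chapter, namespaced XHTML with no HTML-only named entities, and an h1 as the chapter title. The function names and the fields id, book_id and chapter_no are made up for illustration; only chapter_title and chapter_text come from the notes.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}
XHTML = "{http://www.w3.org/1999/xhtml}"


def read_chapters(epub_path):
    """Return [(title, text), ...] in spine (reading) order."""
    chapters = []
    with zipfile.ZipFile(epub_path) as z:
        # container.xml points at the OPF package document.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_path = container.find("c:rootfiles/c:rootfile", NS).get("full-path")
        opf = ET.fromstring(z.read(opf_path))
        base = posixpath.dirname(opf_path)
        # The manifest maps item ids to files; the spine lists ids in reading order.
        hrefs = {
            item.get("id"): item.get("href")
            for item in opf.findall("opf:manifest/opf:item", NS)
        }
        for ref in opf.findall("opf:spine/opf:itemref", NS):
            path = posixpath.normpath(posixpath.join(base, hrefs[ref.get("idref")]))
            body = ET.fromstring(z.read(path)).find(XHTML + "body")
            # Assumption: the first h1 is the chapter title.
            heading = body.find(".//" + XHTML + "h1")
            title = "".join(heading.itertext()).strip() if heading is not None else ""
            # Join text nodes with spaces and normalise whitespace.
            # The heading text is included in the chapter text as well.
            text = " ".join(" ".join(body.itertext()).split())
            chapters.append((title, text))
    return chapters


def chapter_docs(book_id, chapters):
    """One Solr document per chapter; collapse on book_id at query time."""
    return [
        {
            "id": f"{book_id}-{n}",      # hypothetical id scheme
            "book_id": book_id,          # hypothetical field to collapse on
            "chapter_no": n,             # hypothetical field to keep order
            "chapter_title": title,
            "chapter_text": text,
        }
        for n, (title, text) in enumerate(chapters, start=1)
    ]


def book_doc(book_id, chapters):
    """One document per book with parallel multivalued fields, in chapter order."""
    return {
        "id": book_id,
        "chapter_title": [title for title, _ in chapters],
        "chapter_text": [text for _, text in chapters],
    }
```

With the first layout, a query could collapse hits to one per book with something like fq={!collapse field=book_id}. With the second, the two lists stay aligned by position in the document sent to Solr; whether that order survives on the Solr side depends on how the fields are defined, which this sketch does not check. Either result can be serialised with json.dumps and sent to Solr's update handler.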