Create a custom Scrapy exporter inside your project package, alongside settings.py. Assuming the project is named my_project, we can call the module my_project_csv_item_exporter.py:
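
    # Note: these import paths are from older Scrapy releases. Since Scrapy 1.0
    # the exporter lives in scrapy.exporters, and scrapy.conf is deprecated.
    from scrapy.conf import settings
    from scrapy.contrib.exporter import CsvItemExporter


    class MyProjectCsvItemExporter(CsvItemExporter):

        def __init__(self, *args, **kwargs):
            # Use the CSV_DELIMITER setting, falling back to a comma.
            delimiter = settings.get('CSV_DELIMITER', ',')
            kwargs['delimiter'] = delimiter

            # Only override the column list (and order) when one is configured.
            fields_to_export = settings.get('FIELDS_TO_EXPORT', [])
            if fields_to_export:
                kwargs['fields_to_export'] = fields_to_export

            super(MyProjectCsvItemExporter, self).__init__(*args, **kwargs)
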
In settings.py, register this exporter for the csv format and list the fields to export in the order you want the columns to appear, like this:

    FEED_EXPORTERS = {
        'csv': 'my_project.my_project_csv_item_exporter.MyProjectCsvItemExporter',
    }

    FIELDS_TO_EXPORT = [
        'id',
        'name',
        'email',
        'address',
    ]

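
To make the effect of the field list and the delimiter concrete, here is a minimal stdlib-only sketch. It does not use Scrapy at all: it assumes the exporter behaves roughly like Python's csv module with a fixed column list, and rows_to_csv is a hypothetical helper introduced only for illustration.

```python
import csv
import io

# Same column order as in the settings example above.
FIELDS_TO_EXPORT = ['id', 'name', 'email', 'address']


def rows_to_csv(rows, fields, delimiter=','):
    """Write dict rows as CSV text, with columns in the order given by `fields`.

    Hypothetical helper (not a Scrapy API). Keys that are not listed in
    `fields` are dropped, and missing keys become empty cells.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=fields,
        delimiter=delimiter,
        extrasaction='ignore',
        lineterminator='\n',
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# The dict's own key order does not matter; 'phone' is not exported.
items = [
    {'email': 'a@example.com', 'name': 'Ann', 'id': 1,
     'address': '1 Main St', 'phone': '555'},
]
output = rows_to_csv(items, FIELDS_TO_EXPORT, delimiter='\t')
```

The header and every row come out as id, name, email, address, separated by tabs, regardless of the order in which the item's fields were filled in.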
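The CSV delimiter can be set either in settings.py or on the command line when you run the spider.

In settings.py:

    CSV_DELIMITER = "\t"  # tab

Or on the command line, using -s (which overrides a setting; -a would only pass an argument to the spider):

    scrapy crawl my_spider -o output.csv -t csv -s CSV_DELIMITER=$'\t'

Here $'\t' is bash syntax for a literal tab character. A plain "\t" in double quotes reaches Scrapy as a backslash followed by t, which is not a valid one-character delimiter.
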
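
If you would rather keep typing "\t" on the command line, one option is to decode the escape sequence before handing the value to the exporter. The following is only a sketch under that assumption; normalize_delimiter is a hypothetical helper, not part of Scrapy, and you would call it yourself in the exporter's __init__ on the value read from CSV_DELIMITER.

```python
def normalize_delimiter(value, default=','):
    """Return a one-character delimiter for the csv module.

    Hypothetical helper (not a Scrapy API). Accepts either a real single
    character or a backslash escape such as the two-character string
    backslash + 't', which is what a shell usually passes for "\\t".
    """
    if not value:
        return default
    # Decode backslash escapes like '\\t' into the character they name.
    if len(value) > 1 and value.startswith('\\'):
        value = value.encode('ascii').decode('unicode_escape')
    # Python's csv module only accepts a single character here.
    if len(value) != 1:
        raise ValueError('CSV delimiter must be a single character, got %r' % value)
    return value


# What a shell typically delivers for -s CSV_DELIMITER="\t"
from_cli = normalize_delimiter('\\t')
# What settings.py delivers for CSV_DELIMITER = "\t"
from_settings = normalize_delimiter('\t')
```

Both calls yield a real tab character, so the same exporter code works whichever way the setting was supplied.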
I just wanted to say thanks for sharing this! This seems much saner than the other approaches I've seen.