Project author: python-cognitive-search

Project description:
Unofficial Python client for Azure Cognitive Search
Primary language: HTML
Project URL: git://github.com/python-cognitive-search/azuresearch.git
Created: 2019-01-13T09:53:59Z
Project community: https://github.com/python-cognitive-search/azuresearch

License: MIT license


Cognitive Search - Python package

Create indexes, indexers, suggesters, analyzers, scoring profiles, and custom or predefined skills from Python.
Upload documents and manage data sources for Azure Search.
Build a data pipeline from a source, through cognitive skills (predefined or custom), into Azure Search.

For this to work you need the following environment variables set:

  1. AZURE_SEARCH_API_KEY={a regular search api key}
  2. AZURE_SEARCH_ADMIN_API_KEY={a search admin api key}
  3. AZURE_SEARCH_URL=https://{your search service name}.search.windows.net
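
Before running anything against the service, it can help to check that all three variables are actually set. The following is a minimal sketch, not part of this package: the helper name `load_search_settings` is hypothetical, and only the variable names come from the list above.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = (
    "AZURE_SEARCH_API_KEY",
    "AZURE_SEARCH_ADMIN_API_KEY",
    "AZURE_SEARCH_URL",
)


def load_search_settings(environ=None):
    """Return the required settings as a dict, or raise if any is missing.

    Hypothetical helper for illustration; the package itself may read
    these variables differently.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_search_settings()` with no arguments reads from the real process environment; passing a plain dict makes the check easy to test.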

In addition, a Cognitive Services resource is needed if you use cognitive skills. For more info: https://docs.microsoft.com/en-us/azure/search/cognitive-search-attach-cognitive-services

Features:

  1. Define fields and indexes through Python
  2. Define skills and skillsets: predefined Cognitive Search skills and custom skills (WebAPI skills)
  3. Define analyzers (custom analyzers and predefined analyzers)
  4. Define scoring profiles and suggesters
  5. Upload documents to Azure Search
  6. Manage data sources
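
For orientation, defining fields and an index through Python ultimately has to produce a JSON index definition of the kind the Azure Search REST API accepts. The sketch below builds such a definition with plain dictionaries. The helpers `field` and `index_definition` are hypothetical and are not this package's API. The field names mirror the usage example in this README, and the `Edm.String` / `Collection(Edm.String)` type names are those used by the REST API.

```python
import json


def field(name, edm_type, key=False, searchable=True):
    """Build one field entry (hypothetical helper, not the package's API)."""
    return {"name": name, "type": edm_type, "key": key, "searchable": searchable}


def index_definition(name, fields):
    """Build an index definition body from a name and a list of field entries."""
    return {"name": name, "fields": list(fields)}


definition = index_definition(
    "my-index",
    [
        field("id", "Edm.String", key=True),             # document key
        field("keyPhrases", "Collection(Edm.String)"),   # list of strings
        field("content", "Edm.String"),
    ],
)

# Serialize to the JSON text that would be sent as the request body.
payload = json.dumps(definition, indent=2)
```

The package's own field and index classes presumably wrap this kind of structure and handle the HTTP calls, so the dictionaries here are only meant to show the shape of the data.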

Originally forked from https://github.com/python-azure-search/python-azure-search

Example usage (WIP):

    import time

    # DataSource, Index, StringField, CollectionField, KeyPhraseExtractionSkill,
    # Skillset and Indexer come from this package; import them before running.

    # Create the data source from its parameters (name, connection string, etc.)
    datasource = DataSource.load(name="datasource", connection_string="xxx", container_name="cont")
    datasource.delete_if_exists()
    datasource.create()

    # Define fields and index
    field1 = StringField("id", key=True)
    field2 = CollectionField("keyPhrases")
    field3 = StringField("content")
    index = Index("my-index", fields=[field1, field2, field3])
    index.delete_if_exists()
    index.create()

    # Define skills
    keyph_skill = KeyPhraseExtractionSkill()
    skillset = Skillset(skills=[keyph_skill],
                        name="my-skillset",
                        description="skillset with one skill",
                        cognitive_services_key="YOUR_COG_SERVICES_KEY")
    skillset.delete_if_exists()
    skillset.create()

    # Define indexer
    indexer = Indexer(name="my-indexer", data_source_name=datasource.name,
                      target_index_name=index.name, skillset_name=skillset.name)
    indexer.delete_if_exists()
    indexer.create()

    # Poll until the indexer fails or its last run is no longer in progress
    indexer_status = ""
    last_run_status = None
    while indexer_status != "error" and (last_run_status is None or last_run_status == "inProgress"):
        status = indexer.get_status()
        indexer_status = status.get("status")
        last_run_status = status.get("lastResult")
        if last_run_status is not None:
            last_run_status = last_run_status.get("status")
            print("last run status: " + last_run_status)
        print("indexer status is: " + indexer_status)
        time.sleep(3)  # wait 3 seconds before rechecking

    assert indexer_status == "running"
    assert last_run_status == "success"
    indexer.verify()

    # Search something
    res = index.search("Microsoft")
    print(res)

    # Delete all
    datasource.delete()
    index.delete()
    skillset.delete()
    indexer.delete()
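
The polling loop in the example has no upper bound, so it would spin forever if a run never left the "inProgress" state. One possible way to wrap it is sketched below. The function `wait_for_indexer` is hypothetical and not part of this package. It assumes only that the status callable returns a dict with a "status" key and an optional "lastResult" dict, as the example above uses it.

```python
import time


def wait_for_indexer(get_status, poll_seconds=3.0, timeout_seconds=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the indexer errors or its last run finishes.

    Returns (indexer_status, last_run_status). Raises TimeoutError if the
    run is still pending after timeout_seconds. Hypothetical helper.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        indexer_status = status.get("status")
        # "lastResult" may be missing or None before the first run starts.
        last_result = status.get("lastResult") or {}
        last_run_status = last_result.get("status")
        # Same exit condition as the loop in the example above.
        if indexer_status == "error" or last_run_status not in (None, "inProgress"):
            return indexer_status, last_run_status
        if clock() >= deadline:
            raise TimeoutError("indexer did not finish within the timeout")
        sleep(poll_seconds)
```

With the objects from the example, the call would look like `indexer_status, last_run_status = wait_for_indexer(indexer.get_status)`. The `sleep` and `clock` parameters exist only so the function can be exercised without real waiting.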