update instructions for TER, OGC, Performance Fees
This commit is contained in:
parent
276ff93a1d
commit
352886ade2
|
|
@ -35,6 +35,7 @@
|
||||||
"Ratio de gastos totales"
|
"Ratio de gastos totales"
|
||||||
],
|
],
|
||||||
"german": [
|
"german": [
|
||||||
|
"TER inkl. Performance-Fee in % **)",
|
||||||
"Gesamtgebühren",
|
"Gesamtgebühren",
|
||||||
"Kostenpauschale",
|
"Kostenpauschale",
|
||||||
"Gesamtkostenquote",
|
"Gesamtkostenquote",
|
||||||
|
|
@ -242,6 +243,7 @@
|
||||||
"Gastos corrientes en porcentaje 3)"
|
"Gastos corrientes en porcentaje 3)"
|
||||||
],
|
],
|
||||||
"german": [
|
"german": [
|
||||||
|
"Ongoing Charges inkl.Performance-Fee in % **)",
|
||||||
"Laufende Kosten",
|
"Laufende Kosten",
|
||||||
"Laufende Kosten in Prozent",
|
"Laufende Kosten in Prozent",
|
||||||
"Laufende Kosten 1)",
|
"Laufende Kosten 1)",
|
||||||
|
|
|
||||||
|
|
@ -68,12 +68,19 @@
|
||||||
"- With \"TER with Performance Fee\" and \"Fund TER\", pick up the values from \"TER with Performance Fee\".",
|
"- With \"TER with Performance Fee\" and \"Fund TER\", pick up the values from \"TER with Performance Fee\".",
|
||||||
"- With \"TER (en %) (with performance)\" and \"TER(en %) (without performance)\", pick up the values from \"TER (en %) (with performance)\".",
|
"- With \"TER (en %) (with performance)\" and \"TER(en %) (without performance)\", pick up the values from \"TER (en %) (with performance)\".",
|
||||||
"- With \"TER including Performance Fee\" and \"TER excluding Performance Fee\", pick up the values from \"TER including Performance Fee\".",
|
"- With \"TER including Performance Fee\" and \"TER excluding Performance Fee\", pick up the values from \"TER including Performance Fee\".",
|
||||||
|
"- With \"TER inkl. Performance-Fee in % **)\" and \"TER exkl. Performance-Fee in % **)\", pick up the values from \"TER inkl. Performance-Fee in % **)\".",
|
||||||
|
"- With \"TER inkl. Performance-Fee in % **)\" and \"TER inkl. Performance-Fee in % (inkl. Zielfonds)\", pick up the values from \"TER inkl. Performance-Fee in % **)\".",
|
||||||
"- With both of \"Synthetic TER\" and \"Fund TER\", if \"Synthetic TER\" with value(s), pick up the value(s) from \"Synthetic TER\", otherwise, pick up the value(s) from \"Fund TER\".",
|
"- With both of \"Synthetic TER\" and \"Fund TER\", if \"Synthetic TER\" with value(s), pick up the value(s) from \"Synthetic TER\", otherwise, pick up the value(s) from \"Fund TER\".",
|
||||||
"- With both of \"Net TER\" and \"Capped Expense Ratio\", the priority is \"Net TER\", please exclude the column: \"Capped Expense Ratio\", only pick up the values from \"Net TER\".",
|
"- With both of \"Net TER\" and \"Capped Expense Ratio\", the priority is \"Net TER\", please exclude the column: \"Capped Expense Ratio\", only pick up the values from \"Net TER\".",
|
||||||
"- With \"Gross TER\", \"Waiver\", \"Net TER\", \"Capped Expense Ratio\" as column titles, pick up the values from \"Net TER\".",
|
"- With \"Gross TER\", \"Waiver\", \"Net TER\", \"Capped Expense Ratio\" as column titles, pick up the values from \"Net TER\".",
|
||||||
"- If exist Gross TER as column title, please ignore this title",
|
"- If exist Gross TER as column title, please ignore this title",
|
||||||
"Please ignore TER values which with the exception of performance fees or excluded performance fees."
|
"Please ignore TER values which with the exception of performance fees or excluded performance fees."
|
||||||
],
|
],
|
||||||
|
"ogc": [
|
||||||
|
"If there are multiple OGC value columns, here is the priority rules:",
|
||||||
|
"- With \"Ongoing Charges inkl. Performance-Fee in % **)\" and \"Ongoing Charges exkl. Performance-Fee in % **)\", pick up the values from \"Ongoing Charges inkl. Performance-Fee in % **)\".",
|
||||||
|
"- With \"Ongoing Charges inkl. Performance-Fee in % **)\" and \"Ongoing Charges inkl. Performance-Fee in % (inkl. Zielfonds)\", pick up the values from \"Ongoing Charges inkl. Performance-Fee in % **)\"."
|
||||||
|
],
|
||||||
"performance_fee": [
|
"performance_fee": [
|
||||||
"The performance fees should not be the presence of the rates at which the performance fees are calculated."
|
"The performance fees should not be the presence of the rates at which the performance fees are calculated."
|
||||||
]
|
]
|
||||||
|
|
@ -162,12 +169,20 @@
|
||||||
"The performance fees should be:",
|
"The performance fees should be:",
|
||||||
"TER including performance fees - TER excluding performance fees.",
|
"TER including performance fees - TER excluding performance fees.",
|
||||||
"Here is the example:",
|
"Here is the example:",
|
||||||
|
"Example 1:",
|
||||||
"-----Example Start-----",
|
"-----Example Start-----",
|
||||||
"GAMAX FUNDS FCP\\nClass\\nTER (excluding Performance Fees)\\nTER (including Performance Fees)\\nGAMAX FUNDS - ASIA PACIFIC\\nA\\n2.07%\\n2.07%\\n",
|
"GAMAX FUNDS FCP\\nClass\\nTER (excluding Performance Fees)\\nTER (including Performance Fees)\\nGAMAX FUNDS - ASIA PACIFIC\\nA\\n2.07%\\n2.07%\\n",
|
||||||
"-----Example End-----",
|
"-----Example End-----",
|
||||||
"The output should be:",
|
"The output should be:",
|
||||||
"{\"data\": [{\"fund name\": \"GAMAX FUNDS - ASIA PACIFIC\", \"share name\": \"A\", \"ter\": 2.07, \"performance_fee\": 0}]}",
|
"{\"data\": [{\"fund name\": \"GAMAX FUNDS - ASIA PACIFIC\", \"share name\": \"A\", \"ter\": 2.07, \"performance_fee\": 0}]}",
|
||||||
"The performance fees value is TER (including Performance Fees) - TER (excluding Performance Fees) = 2.07 - 2.07 = 0",
|
"The performance fees value is TER (including Performance Fees) - TER (excluding Performance Fees) = 2.07 - 2.07 = 0",
|
||||||
|
"Example 2:",
|
||||||
|
"-----Example Start-----",
|
||||||
|
"D/S Strategie ausgewogen\\nErgänzende Angaben für Anleger in der Schweiz zum 31. Dezember 2020 (ungeprüft)\\nFonds\\nTER exkl. \\nPerformance-Fee in % **)\\nTER inkl. \\nPerformance-Fee in % **)\\nTER inkl. \\nPerformance-Fee in % (inkl. Zielfonds)\\n1,15\\n1,63\\n1,15\\n1,63\\nTER exkl.\\nPerformance-Fee in % (inkl. Zielfonds)",
|
||||||
|
"-----Example End-----",
|
||||||
|
"The output should be:",
|
||||||
|
"{\"data\": [{\"fund name\": \"D/S Strategie ausgewogen\", \"ter\": 1.15, \"performance_fee\": 0}]}",
|
||||||
|
"The performance fees value is TER inkl. Performance-Fee in % **) - TER exkl. Performance-Fee in % **) = 1,15 - 1,15 = 0",
|
||||||
"Case 2:",
|
"Case 2:",
|
||||||
"If some table is with three value columns: \"TER including performance fees\", \"TER excluding performance fees\", \"Performance fees\", ",
|
"If some table is with three value columns: \"TER including performance fees\", \"TER excluding performance fees\", \"Performance fees\", ",
|
||||||
"The Performance fees value in column: Performance fees, chould be \"-\", because of \"TER including performance fees\" - \"TER excluding performance fees\" = 0, ",
|
"The Performance fees value in column: Performance fees, chould be \"-\", because of \"TER including performance fees\" - \"TER excluding performance fees\" = 0, ",
|
||||||
|
|
@ -184,6 +199,30 @@
|
||||||
"The output should be:",
|
"The output should be:",
|
||||||
"{\"data\": [{\"fund name\": \"Pictet Corto Europe Long Short\", \"share name\": \"P EUR\", \"ter\": 1.98, \"performance_fee\": 0}]}"
|
"{\"data\": [{\"fund name\": \"Pictet Corto Europe Long Short\", \"share name\": \"P EUR\", \"ter\": 1.98, \"performance_fee\": 0}]}"
|
||||||
]
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"title": "Performance fees is part of OGC:",
|
||||||
|
"contents": [
|
||||||
|
"If exist both of \"Ongoing Charges including Performance Fee\" and \"Ongoing Charges excluding Performance Fee\",",
|
||||||
|
"The OGC should be \"Ongoing Charges including Performance Fee\".",
|
||||||
|
"The performance fees should be:",
|
||||||
|
"Ongoing Charges including Performance Fee - Ongoing Charges excluding Performance Fee.",
|
||||||
|
"Here is the example:",
|
||||||
|
"Example 1:",
|
||||||
|
"-----Example Start-----",
|
||||||
|
"GAMAX FUNDS FCP\\nClass\\Ongoing Charges (excluding Performance Fees)\\Ongoing Charges (including Performance Fees)\\nFund 1\\nShare A\\n1.50%\\n1.58%\\n",
|
||||||
|
"-----Example End-----",
|
||||||
|
"The output should be:",
|
||||||
|
"{\"data\": [{\"fund name\": \"Fund 1\", \"share name\": \"Share 1\", \"ogc\": 1.58, \"performance_fee\": 0.08}]}",
|
||||||
|
"The performance fees value is Ongoing Charges including Performance Fee - Ongoing Charges excluding Performance Fee = 1.58 - 1.50 = 0.08",
|
||||||
|
"Example 2:",
|
||||||
|
"-----Example Start-----",
|
||||||
|
"Fund1\\nOngoing Charges exkl. \\nPerformance-Fee in % **)\\nOngoing Charges exkl. \\nPerformance-Fee in % (inkl. Zielfonds)\\nOngoing Charges inkl. \\nPerformance-Fee in % **)\\nOngoing Charges inkl. \\nPerformance-Fee in % (inkl. Zielfonds)\\n1,15\\n1,35\\n1,20\\n1,35\\n",
|
||||||
|
"-----Example End-----",
|
||||||
|
"The output should be:",
|
||||||
|
"{\"data\": [{\"fund name\": \"Fund1\", \"ogc\": 1.20, \"performance_fee\": 0.05}]}",
|
||||||
|
"The performance fees value is Ongoing Charges inkl. Performance-Fee in % **) - Ongoing Charges exkl. Performance-Fee in % **) = 1.20 - 1.15 = 0.05"
|
||||||
|
]
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
|
|
||||||
4
main.py
4
main.py
|
|
@ -1194,12 +1194,12 @@ if __name__ == "__main__":
|
||||||
"534535767"
|
"534535767"
|
||||||
]
|
]
|
||||||
special_doc_id_list = check_db_mapping_doc_id_list
|
special_doc_id_list = check_db_mapping_doc_id_list
|
||||||
# special_doc_id_list = ["334584772"]
|
special_doc_id_list = ["334584772"]
|
||||||
output_mapping_child_folder = r"/data/emea_ar/output/mapping_data/docs/"
|
output_mapping_child_folder = r"/data/emea_ar/output/mapping_data/docs/"
|
||||||
output_mapping_total_folder = r"/data/emea_ar/output/mapping_data/total/"
|
output_mapping_total_folder = r"/data/emea_ar/output/mapping_data/total/"
|
||||||
re_run_extract_data = True
|
re_run_extract_data = True
|
||||||
re_run_mapping_data = True
|
re_run_mapping_data = True
|
||||||
force_save_total_data = True
|
force_save_total_data = False
|
||||||
calculate_metrics = False
|
calculate_metrics = False
|
||||||
|
|
||||||
extract_ways = ["text"]
|
extract_ways = ["text"]
|
||||||
|
|
|
||||||
|
|
@ -572,6 +572,47 @@
|
||||||
"list(document_mapping[\"ShareClassName\"].unique())"
|
"list(document_mapping[\"ShareClassName\"].unique())"
|
||||||
]
|
]
|
||||||
},
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 1,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [],
|
||||||
|
"source": [
|
||||||
|
"import pymupdf4llm"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"cell_type": "code",
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"outputs": [
|
||||||
|
{
|
||||||
|
"name": "stdout",
|
||||||
|
"output_type": "stream",
|
||||||
|
"text": [
|
||||||
|
"Processing ./data/emea_ar/pdf/501380553.pdf...\n",
|
||||||
|
"[ ] (0/47[ ] ( 1/47[= ] ( 2/4[== ] ( 3/47[=== ] ( 4/4[==== ] ( 5/47[===== ] ( 6/47[===== ] ( 7/4[====== ] ( 8/47[======= ] ( 9/4[======== ] (10/47[========= ] (11/4[========== ] (12/47[=========== ] (13/47[=========== ] (14/4[============ ] (15/47[============= ] (16/4[============== ] (17/47[=============== ] (18/4[================ ] (19/47[================= ] (20/47[================= ] (21/4[================== ] (22/47[=================== ] (23/4[==================== ] (24/47[===================== ] (25/4[====================== ] (26/4[====================== ] (27/47[======================= ] (28/4[======================== ] (29/47[========================= ] (30/4[========================== ] (31/47[=========================== ] (32/4[============================ ] (33/4[============================ ] (34/47[============================= ] (35/4[============================== ] (36/47[=============================== ] (37/4[================================ ] (38/47[================================= ] (39/4[================================== ] (40/4[================================== ] (41/47[=================================== ] (42/4[==================================== ] (43/47[===================================== ] (44/4[====================================== ] (45/47[======================================= ] (46/47[========================================] (47/47]\n"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"data": {
|
||||||
|
"text/plain": [
|
||||||
|
"107851"
|
||||||
|
]
|
||||||
|
},
|
||||||
|
"execution_count": 2,
|
||||||
|
"metadata": {},
|
||||||
|
"output_type": "execute_result"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"source": [
|
||||||
|
"md_text = pymupdf4llm.to_markdown(r\"./data/emea_ar/pdf/501380553.pdf\")\n",
|
||||||
|
"\n",
|
||||||
|
"# now work with the markdown text, e.g. store as a UTF8-encoded file\n",
|
||||||
|
"import pathlib\n",
|
||||||
|
"pathlib.Path(r\"./data/emea_ar/output/markdown/501380553.md\").write_bytes(md_text.encode())"
|
||||||
|
]
|
||||||
|
},
|
||||||
{
|
{
|
||||||
"cell_type": "code",
|
"cell_type": "code",
|
||||||
"execution_count": null,
|
"execution_count": null,
|
||||||
|
|
|
||||||
Loading…
Reference in New Issue