{"id":5779,"date":"2024-09-05T15:47:00","date_gmt":"2024-09-05T07:47:00","guid":{"rendered":"https:\/\/www.aisharenet.com\/?p=5779"},"modified":"2024-09-05T15:47:00","modified_gmt":"2024-09-05T07:47:00","slug":"tf-id","status":"publish","type":"post","link":"https:\/\/www.kdjingpai.com\/pt\/tf-id\/","title":{"rendered":"TF-ID: A Table\/Figure Identification Tool for Academic Papers"},"content":{"rendered":"<p>TF-ID (Table\/Figure IDentifier) is a family of object detection models built specifically to extract tables and figures from academic papers. The project was created by Yifei Hu and is open source on GitHub. The TF-ID models are fine-tuned to detect and extract tables and figures in academic papers, with or without their caption text. The project provides the complete training code, the model weights, and a human-annotated dataset, all released under the MIT license.<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter  wp-image-5780\" title=\"TF-ID: A Table\/Figure Identification Tool for Academic Papers-1\" src=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb.jpg\" alt=\"TF-ID: A Table\/Figure Identification Tool for Academic Papers-1\" width=\"844\" height=\"474\" srcset=\"https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb.jpg 1968w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb-300x168.jpg 300w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb-1024x574.jpg 1024w, 
https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb-768x431.jpg 768w, https:\/\/www.kdjingpai.com\/wp-content\/uploads\/2024\/09\/46fc66cfdda1efb-1536x862.jpg 1536w\" sizes=\"auto, (max-width: 844px) 100vw, 844px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2>Features<\/h2>\n<ul>\n<li>Extracts tables and figures from academic papers<\/li>\n<li>Supports extraction with or without caption text<\/li>\n<li>Provides the complete training code and model weights<\/li>\n<li>Supports extracting tables and figures from PDF files<\/li>\n<li>Offers several model variants to suit different needs<\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<h2>Usage Guide<\/h2>\n<h3>Installation<\/h3>\n<ol>\n<li>Clone the repository:\n<pre><code class=\"language-bash\">git <span class=\"hljs-built_in\">clone<\/span> https:\/\/github.com\/ai8hyf\/TF-ID\n<span class=\"hljs-built_in\">cd<\/span> TF-ID\n<\/code><\/pre>\n<\/li>\n<li>Download the dataset: download it from Hugging Face and unzip it into the target directory.\n<pre><code class=\"language-bash\">wget https:\/\/huggingface.co\/datasets\/yifeihu\/TF-ID-arxiv-papers\/resolve\/main\/arxiv_paper_images.zip\nunzip arxiv_paper_images.zip -d .\/images\n<\/code><\/pre>\n<\/li>\n<li>Convert the dataset format:\n<pre><code class=\"language-bash\">python coco_to_florence.py\n<\/code><\/pre>\n<\/li>\n<li>Train the model:\n<pre><code class=\"language-bash\">accelerate launch train.py\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<h3>Usage<\/h3>\n<ol>\n<li>Extract tables and figures from a single image:\n<pre><code class=\"language-bash\">python inference.py --image_path 
path\/to\/image.png\n<\/code><\/pre>\n<\/li>\n<li>Extract all tables and figures from a PDF file:\n<pre><code class=\"language-bash\">python pdf_to_table_figures.py --pdf_path path\/to\/paper.pdf --output_dir .\/sample_output\n<\/code><\/pre>\n<\/li>\n<\/ol>\n<h3>Detailed Workflow<\/h3>\n<ol>\n<li><strong>Extract tables and figures from a single image<\/strong>:\n<ul>\n<li>Pass the image path to the <code>inference.py<\/code> script, which uses the default TF-ID-large model to detect the tables and figures in the image.<\/li>\n<li>The results are returned as bounding boxes that mark where the tables and figures are located in the image.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Extract all tables and figures from a PDF file<\/strong>:\n<ul>\n<li>Pass the PDF path to the <code>pdf_to_table_figures.py<\/code> script, which extracts every table and figure in the PDF and saves the cropped images to the specified output directory.<\/li>\n<li>Extraction uses the TF-ID-large model by default; to switch to another model variant, change the <code>model_id<\/code> parameter in the script.<\/li>\n<\/ul>\n<\/li>\n<li><strong>Train the model<\/strong>:\n<ul>\n<li>After cloning the repository and downloading the dataset, use the <code>coco_to_florence.py<\/code> script to convert the dataset to the Florence 2 format.<\/li>\n<li>Run <code>accelerate launch 
train.py<\/code> to start training; checkpoint files are saved as training progresses.<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>TF-ID (Table\/Figure IDentifier) is a family of object detection models built specifically to extract tables and figures from academic papers. The project was created by Yifei Hu and is open source on GitHub. The TF-ID models are fine-tuned to detect and extract tables and figures in academic papers, with support for&#8230;<\/p>\n","protected":false},"author":1,"featured_media":60970,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[230],"class_list":["post-5779","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-tool","tag-aikaiyuanxiangmu"],"_links":{"self":[{"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/posts\/5779","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/comments?post=5779"}],"version-history":[{"count":0,"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/posts\/5779\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/media\/60970"}],"wp:attachment":[{"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/media?parent=5779"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/
pt\/wp-json\/wp\/v2\/categories?post=5779"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.kdjingpai.com\/pt\/wp-json\/wp\/v2\/tags?post=5779"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}