This article evaluates RECKONING's generalizability on the real-world multi-hop logical reasoning task, FOLIO.

RECKONING: Reasoning through Dynamic Knowledge Encoding: Generalization to Real-World knowledge

2025/10/25 00:41

Abstract and 1. Introduction

  2. Background

  3. Method

  4. Experiments

    4.1 Multi-hop Reasoning Performance

    4.2 Reasoning with Distractors

    4.3 Generalization to Real-World knowledge

    4.4 Run-time Analysis

    4.5 Memorizing Knowledge

  5. Related Work

  6. Conclusion, Acknowledgements, and References

A. Dataset

B. In-context Reasoning with Distractors

C. Implementation Details

D. Adaptive Learning Rate

E. Experiments with Large Language Models

4.3 Generalization to Real-World knowledge

To investigate how well our method generalizes to real-world knowledge beyond the synthetic setting, we evaluate RECKONING on FOLIO [29], a more realistic multi-hop logical reasoning task, and report the results in Table 2. The dataset has a rich vocabulary, diverse logic patterns, and abundant language variations, and it has been shown to challenge LLMs in both supervised fine-tuning and in-context learning settings. As the baseline, we fine-tune GPT-2 in the in-context reasoning setting. As before, we train both the GPT-2 baseline and RECKONING with the multi-task objective. We also compare against more advanced baselines, GPT-3.5 (text-davinci-003 [55]) and ChatGPT (gpt-3.5-turbo [2]), two popular large language models with around 175B parameters. We evaluate these two models in both the zero-shot and few-shot settings. In the few-shot setting, we prompt the model with 8 single-task examples randomly sampled from the training set to perform in-context learning. We find that RECKONING (initialized here from GPT-2) performs better than the GPT-2 in-context reasoning baseline. RECKONING also outperforms the two large language models by a significant margin (12% in the zero-shot setting and 7% in the 8-shot setting). We conclude that RECKONING is effective and significantly benefits reasoning tasks that use real-world knowledge.
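The few-shot setting described above (8 examples sampled at random from the training set and placed before the query) can be sketched as follows. This is a minimal illustration, not the authors' prompt: the record fields (`premises`, `conclusion`, `label`), the prompt template, and the helper names `format_example` and `build_few_shot_prompt` are all assumptions made for the example.

```python
import random


def format_example(example, with_label):
    """Render one example as text. The template is hypothetical, not from the paper."""
    premises = "\n".join(example["premises"])
    text = f"Premises:\n{premises}\nConclusion: {example['conclusion']}\nAnswer:"
    if with_label:
        # Demonstrations carry their gold label; the query is left open.
        text += f" {example['label']}"
    return text


def build_few_shot_prompt(train_set, query, k=8, seed=0):
    """Sample k training examples without replacement and prepend them to the query.

    With k=0 this reduces to a zero-shot prompt containing only the query.
    """
    rng = random.Random(seed)  # fixed seed so the sampled demonstrations are reproducible
    shots = rng.sample(train_set, k)
    blocks = [format_example(ex, with_label=True) for ex in shots]
    blocks.append(format_example(query, with_label=False))
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Toy placeholder data; real examples would come from the FOLIO training split.
    toy_train = [
        {
            "premises": [f"All A{i} are B{i}.", f"x is an A{i}."],
            "conclusion": f"x is a B{i}.",
            "label": "True",
        }
        for i in range(10)
    ]
    toy_query = {
        "premises": ["All cats are mammals.", "Tom is a cat."],
        "conclusion": "Tom is a mammal.",
    }
    print(build_few_shot_prompt(toy_train, toy_query, k=8))
```

Fixing the seed keeps the demonstrations identical across runs, which makes zero-shot (`k=0`) and 8-shot (`k=8`) results easier to compare; whether the original evaluation resampled demonstrations per query is not stated in the text.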

Table 2: Evaluation results on FOLIO. We compare RECKONING against the FT-ICR baseline with GPT-2 and two popular large language models.


:::info Authors:

(1) Zeming Chen, EPFL (zeming.chen@epfl.ch);

(2) Gail Weiss, EPFL (antoine.bosselut@epfl.ch);

(3) Eric Mitchell, Stanford University (eric.mitchell@cs.stanford.edu);

(4) Asli Celikyilmaz, Meta AI Research (aslic@meta.com);

(5) Antoine Bosselut, EPFL (antoine.bosselut@epfl.ch).

:::


:::info This paper is available on arXiv under the CC BY 4.0 DEED license.

:::

2 https://openai.com/blog/chatgpt
