This study investigates the nature of good and bad prompts extracted from Hugging Face datasets using the LFTK (Linguistic Feature Tool Kit) package, and analyzes the correlations among the linguistic features of good and bad prompts through Spearman’s rank correlation analysis. It then proposes efficient prompting strategies from a linguistic perspective, with the aim of reducing hallucination, that is, AI-generated output containing false or misleading information. The results reveal that certain linguistic features, such as linguistic units (i.e., characters, syllables, words, categories), text difficulty, and lexical diversity, can affect the quality of prompts. Specifically, the length of sentences and words, word difficulty, the use of nouns and determiners, and the frequency of commonly used words are found to be significant factors in prompting. These findings may help provide Generative AI users with guidelines for writing linguistically well-crafted, high-quality prompts.
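To make the correlation step concrete, the following is a minimal sketch of Spearman's rank correlation applied to surface-level prompt features. It is an illustration under stated assumptions, not the study's pipeline: it does not call LFTK, the three features in `simple_features` are simplified stand-ins for the kinds of measures named above (length, word length, lexical diversity), and the prompts in `toy_prompts` are invented for the example.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


def simple_features(prompt):
    """Hypothetical stand-in features; NOT the LFTK feature set."""
    words = prompt.lower().split()
    return {
        "word_count": len(words),
        "mean_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }


if __name__ == "__main__":
    # Invented prompts, used only to exercise the functions above.
    toy_prompts = [
        "Summarize the article",
        "Summarize the attached article in three short bullet points",
        "Tell me stuff about the thing and the other thing",
        "List the main causes of the 1929 stock market crash",
    ]
    rows = [simple_features(p) for p in toy_prompts]
    lengths = [r["word_count"] for r in rows]
    diversity = [r["type_token_ratio"] for r in rows]
    print("rho(word_count, type_token_ratio) =", spearman_rho(lengths, diversity))
```

Because Spearman's coefficient depends only on ranks, it captures monotonic relationships between features without assuming linearity or normally distributed values. In practice the same computation is available as `scipy.stats.spearmanr`, which also reports a p-value.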