Yoshikoder

Yoshikoder is a free, open-source, cross-platform program developed by Will Lowe for Computer-Assisted Text Analysis (CATA). It is popular among social scientists, political researchers, and students because its graphical user interface (GUI) supports quantitative text analysis without requiring any programming knowledge.

Core Concepts of CATA

In Computer-Assisted Text Analysis (CATA), text is analyzed systematically using pre-defined rules. Yoshikoder relies on three foundational elements:

Documents: Plain text files (.txt in UTF-8 format) containing the speeches, tweets, news articles, or interview transcripts you want to study.

Categories: Analytical buckets or themes (e.g., “Positive Sentiment”, “Economic Policy”, or “Tired”) that you want to measure across your text.

Patterns: Specific keywords or phrases assigned to a category. For example, the patterns “inflation”, “tax”, and “budget” would sit under an “Economy” category.
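
The relationship between documents, categories, and patterns can be sketched in a few lines of Python. This is only an illustration of the idea, not how Yoshikoder works internally: the dictionary contents, the sample sentence, and the function name count_categories are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical dictionary: category name -> list of keyword patterns.
dictionary = {
    "Economy": ["inflation", "tax", "budget"],
    "Positive Sentiment": ["good", "great", "improve"],
}

def count_categories(text, dictionary):
    """Count how many word tokens in `text` match each category's patterns."""
    tokens = re.findall(r"\w+", text.lower())
    token_counts = Counter(tokens)
    return {
        category: sum(token_counts[p] for p in patterns)
        for category, patterns in dictionary.items()
    }

# A made-up one-sentence "document".
document = "The budget raises one tax and cuts another tax. Inflation is a good test."
counts = count_categories(document, dictionary)
print(counts)
```

Each category total is simply the sum of the counts of its patterns, which is why the choice of patterns matters so much for the final numbers.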

How to Use Yoshikoder: Four Steps to Get Started

1. Prepare Your Documents

Yoshikoder reads plain text files only, and UTF-8 is the safest encoding to use.
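
If you are unsure how a file is encoded, you can check it before loading. The sketch below uses only the Python standard library. The helper names is_valid_utf8 and convert_to_utf8 are hypothetical, and the conversion assumes you already know the source encoding (for example "latin-1"), since it cannot be detected reliably from the bytes alone.

```python
from pathlib import Path

def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def convert_to_utf8(src, dst, source_encoding):
    """Re-save a text file as UTF-8, given its known source encoding."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

A file that fails the check can then be re-saved with convert_to_utf8 before it is added to a project.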

If your data is currently in PDF, Microsoft Word, or HTML format, you must convert it first. You can use the Yoshikoder Converter or a standard text editor to save each file as .txt.

2. Build or Import a Dictionary

A dictionary is your code sheet. It holds your categories and their keyword patterns.

To create your own: Open Yoshikoder, add a new Category, and input your chosen Patterns.

Use Wildcards: Use an asterisk (*) to capture word variations. For instance, entering run* will automatically count runner, running, and runs.
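
To see what a wildcard pattern will and will not catch, it can help to try it on sample text first. The Python sketch below assumes that an asterisk stands for any run of word characters and that matching ignores case. Yoshikoder's exact matching rules may differ, and the function names are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Translate a pattern with * wildcards into a compiled regular expression."""
    parts = [re.escape(part) for part in pattern.split("*")]
    # Assumption: each * stands for zero or more word characters.
    return re.compile(r"\w*".join(parts), re.IGNORECASE)

def count_matches(pattern, text):
    """Count the word tokens in `text` that the whole pattern matches."""
    regex = pattern_to_regex(pattern)
    tokens = re.findall(r"\w+", text)
    return sum(1 for token in tokens if regex.fullmatch(token))

sample = "The runner runs daily; running helps. Rune stones do not."
n = count_matches("run*", sample)
print(n)
```

Under these assumptions the pattern also counts "Rune", which shows how a broad wildcard can pick up unrelated words. Checking a concordance for each wildcard pattern is a sensible precaution.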

To import: You can load pre-existing standard dictionaries saved in XML format.

3. Run the Analysis

Once your text files and dictionary are loaded, Yoshikoder can run several types of quantitative analysis:

Word Frequencies: View raw summaries and proportions of how often words appear across your documents.
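
As a rough illustration of what a word frequency table contains, the Python sketch below computes raw counts and proportions for a made-up sentence. The tokenization rule (lowercased runs of word characters) is an assumption for the example and will not necessarily match Yoshikoder's.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Map each word to (raw count, proportion of all tokens)."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    return {word: (n, n / total) for word, n in counts.most_common()}

freqs = word_frequencies("Tax cuts and tax credits help the budget")
for word, (count, proportion) in freqs.items():
    print(f"{word}\t{count}\t{proportion:.3f}")
```

Proportions make documents of different lengths comparable, which raw counts alone do not.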

Dictionary Analysis: Apply your custom dictionary to the documents to see which themes dominate the text.

Concordance (KWIC): Generate keyword-in-context (KWIC) lists. Each entry shows a keyword alongside the words immediately preceding and following it, which helps you understand how the word is used in its local context.

4. Compare and Export

You can compare subsets of text (e.g., comparing the media coverage of two political candidates).

Yoshikoder calculates frequencies, proportions, and statistical comparisons (like relative risk ratios with 95% confidence intervals) to show if differences between texts are statistically significant.
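
For readers who want to see the arithmetic behind a risk ratio, here is a minimal Python sketch using the standard log-based approximation for the 95% confidence interval. It is not taken from Yoshikoder's source, so the software's own calculation may differ in detail, and the counts in the example are invented.

```python
import math

def risk_ratio(count_a, total_a, count_b, total_b, z=1.96):
    """Risk ratio of a category's rate in text A versus text B, with a CI.

    count_* is the number of category matches, total_* the number of words.
    Uses the usual normal approximation on the log scale.
    """
    rate_a = count_a / total_a
    rate_b = count_b / total_b
    rr = rate_a / rate_b
    se = math.sqrt(1 / count_a - 1 / total_a + 1 / count_b - 1 / total_b)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Invented example: 30 matches in 1,000 words versus 15 matches in 1,000 words.
rr, lower, upper = risk_ratio(30, 1000, 15, 1000)
print(f"risk ratio {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

When the interval excludes 1, as it does for these invented counts, the difference in rates is conventionally treated as statistically significant at the 5% level.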

All files, dictionaries, and KWIC outputs are saved in non-proprietary XML formats, making your research fully human-readable and easily shareable.

System Requirements & Limitations

Java Dependency: On Windows, Yoshikoder requires the Java Runtime Environment (version 1.8 or later). Mac users can download a bundled version that does not require a separate Java installation.

Multilingual Tokenization: Yoshikoder supports multiple languages, but text in languages written without spaces between words (such as Chinese) must first be run through an external tokenizer or segmenter so the software can distinguish individual words.
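
To show what segmentation does, the Python sketch below applies greedy forward maximum matching with a tiny, made-up lexicon and joins the result with spaces. It is a toy for illustration only. Real projects should use a dedicated segmenter with a full lexicon, and the function name and max_len parameter are hypothetical.

```python
def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest lexicon word that fits; fall back to
    a single character when nothing in the lexicon matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Toy lexicon, for illustration only.
lexicon = {"经济", "政策", "通货膨胀"}
segmented = " ".join(segment("经济政策与通货膨胀", lexicon))
print(segmented)
```

Saving the space-separated output as a UTF-8 text file gives the software word boundaries it can count.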
