Acts As Censored

by jmbennett
Censor words from content.
I needed to remove swearing and similar language from database content for a recent charity campaign, so I knocked up this behaviour.


<?php

/**
 * Model behavior to censor bad words in model fields
 *
 * @package app
 * @subpackage app.models.behaviors
 */
class CensoredBehavior extends ModelBehavior
{
    /**
     * Contain settings indexed by model name.
     *
     * @var array
     * @access private
     */
    var $__settings = array();

    /**
     * Array containing all bad words to be replaced
     *
     * @var array
     * @access private
     */
    var $__badWords = array(
        // These are code flavoured dummies
        'spaghetti',
        'inline styles',
        'inline JavaScript',
        'table layouts',
        'procedural',
        'fat controllers thin models',
        'interdependencies',
        'etc'
    );
    
    
    /**
     * Initiate behavior for the model using specified settings. Available settings:
     *
     * - fields: array of fields to search and replace in
     *
     * - replace: string substituted for each bad word (defaults to an empty string)
     *
     * - type: determines when replace happens
     *                 - 'find' runs afterFind (non-destructive)
     *                 - 'save' runs beforeSave (destructive)
     *                 - 'both' runs after find and before save
     *
     * @param object $Model Model using the behaviour
     * @param array $settings Settings to override for model.
     * @access public
     */
    function setup(&$Model, $settings = array())
    {
        // default settings: fields to censor, replacement string and when to run
        $default = array('fields' => array('name'), 'replace' => '', 'type' => 'find');

        if (!isset($this->__settings[$Model->alias]))
        {
            $this->__settings[$Model->alias] = $default;
        }

        $this->__settings[$Model->alias] = am($this->__settings[$Model->alias], ife(is_array($settings), $settings, array()));
    }
    
    
    /**
     * Runs before a save() operation.
     *
     * @param object $Model Model using the behaviour
     * @return boolean Always true, so the save continues.
     * @access public
     */
    function beforeSave(&$Model)
    {
        // check fields are configured and replacing on save is enabled
        if (!empty($this->__settings[$Model->alias]['fields']) && ($this->__settings[$Model->alias]['type'] == 'save' || $this->__settings[$Model->alias]['type'] == 'both'))
        {
            // loop through results
            foreach ($Model->data as &$row)
            {
                // loop through fields
                foreach ($this->__settings[$Model->alias]['fields'] as $field)
                {
                    // check field exists
                    if (isset($row[$Model->alias][$field]))
                    {
                        // replace instances of each bad word
                        foreach ($this->__badWords as $word)
                        {
                            $row[$Model->alias][$field] = eregi_replace($word, $this->__settings[$Model->alias]['replace'], $row[$Model->alias][$field]);
                        }
                    }
                }
            }
        }
        return true;
    }
    
    
    /**
     * Runs after a find() operation.
     *
     * @param object $Model Model using the behaviour
     * @param array $results Results of the find operation.
     * @return array Results with bad words replaced in the configured fields.
     * @access public
     */
    function afterFind(&$Model, $results)
    {
        // check fields are configured and replacing on find is enabled
        if (!empty($this->__settings[$Model->alias]['fields']) && ($this->__settings[$Model->alias]['type'] == 'find' || $this->__settings[$Model->alias]['type'] == 'both'))
        {
            // loop through results
            foreach ($results as &$row)
            {
                // loop through fields
                foreach ($this->__settings[$Model->alias]['fields'] as $field)
                {
                    // check field exists
                    if (isset($row[$Model->alias][$field]))
                    {
                        // preg replace on an array?
                        foreach ($this->__badWords as $word)
                        {
                            $row[$Model->alias][$field] = eregi_replace($word, $this->__settings[$Model->alias]['replace'], $row[$Model->alias][$field]);
                        }
                    }
                }
            }
        }
        return $results;
    }
}

?>

Usage:



<?php

class Model extends AppModel
{
    var $name = 'Model';

    var $belongsTo = array();
    var $hasOne = array();
    var $hasMany = array();
    var $hasAndBelongsToMany = array();

    var $actsAs = array(
        'censored' => array(
            'replace' => '',
            'fields' => array('name', 'body'),
            'type' => 'find'
        )
    );
}

?>


Comments

  • grant_cox posted on 06/17/08 06:32:24 PM
    1. You should not be looping over $Model->data like that in beforeSave - it'll just be a single row at that point. So currently this doesn't work for the 'save' option, only the 'find'.

    2. Why the use of eregi_replace instead of preg_replace? I can't see how to use word boundaries in eregi_replace, which means replacing 'ass' will destroy 'classic'. Also, you are only running the replace once, which means 'asasss' will come through as 'ass' anyway.

    3. It would be good to have some replacement options, like that each offending character is replaced with a *, or even that you can provide replacement options (e.g 'ass'=>'butt'). That'd be clbuttic :)


    But I like the option of only sanitizing on find - less destructive. The problem is that afterFind doesn't run on association behaviours yet... Maybe an afterRender page processing step could work - as it'd process the actual rendered data - so it doesn't matter where the profanity comes from. See http://bakery.cakephp.org/articles/view/tidy-output-filtering for an example of this.
  • AD7six posted on 06/17/08 12:25:43 PM
    Hi Jon,
    I edited your bad word list so that this article could be published.
    Cheers,
    AD
  • mariano posted on 02/27/08 02:36:56 PM
    @Jon: Thanks for sharing such a fun little snippet! We can't have Google indexing us with such bad words, so can you please either:
    1. Change the behavior so the list of bad words comes from a model specified by the user when configuring the behavior
    2. Remove the list of bad words and instead put 'dummy', 'dummy2' and tell people to modify that.
    Thanks! Look forward to those changes :)
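
The points grant_cox raises (a single row in beforeSave, word boundaries, and per-word replacements) can be sketched in plain PHP. This is a minimal sketch rather than part of the behavior as published: `censorText()` and `censorRow()` are hypothetical helper names, the word-to-replacement map is an assumed data shape, and the closure assumes PHP 5.3 or later.

```php
<?php

/**
 * Replace whole-word matches of each bad word in $text.
 * $badWords maps each word to its replacement; a null replacement
 * masks the match with one asterisk per character.
 * (Hypothetical helper, not part of CensoredBehavior.)
 */
function censorText($text, array $badWords)
{
    foreach ($badWords as $word => $replacement) {
        // \b anchors the match to word boundaries, so 'ass' leaves 'classic' alone;
        // preg_quote() escapes any regex metacharacters in the word itself.
        $pattern = '/\b' . preg_quote($word, '/') . '\b/i';
        $text = preg_replace_callback($pattern, function ($match) use ($replacement) {
            return $replacement === null
                ? str_repeat('*', strlen($match[0]))
                : $replacement;
        }, $text);
    }
    return $text;
}

/**
 * Censor the given fields of a single row shaped like $Model->data,
 * i.e. array('Alias' => array('field' => 'value')).
 * (Hypothetical helper, not part of CensoredBehavior.)
 */
function censorRow(array $data, $alias, array $fields, array $badWords)
{
    foreach ($fields as $field) {
        if (isset($data[$alias][$field])) {
            $data[$alias][$field] = censorText($data[$alias][$field], $badWords);
        }
    }
    return $data;
}

// Example: one word swapped for another, one masked with asterisks.
$badWords = array('spaghetti' => 'pasta', 'procedural' => null);
$row = array('Post' => array(
    'name' => 'Procedural spaghetti',
    'body' => 'No spaghettification here',
));
$clean = censorRow($row, 'Post', array('name', 'body'), $badWords);
```

Inside the behavior, beforeSave could pass `$Model->data` through something like `censorRow()` directly instead of looping over it, while afterFind would call it once per result row.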